ABSTRACT


Remotely  sensed  data  produced  by  hyperspectral  imagers  contains  hundreds  of 
contiguous narrow spectral bands at each spatial pixel. The substantial dimensionality
and unique character of hyperspectral imagery require display techniques that differ
from traditional image analysis tools.

This  study  investigated  the  appropriate  methodologies  for  displaying 
hyperspectral  images  based  on  the  physical  principles  of  human  color  vision  and  a 
generalized  set  of  linear  transformations.  Principal  components  (PC)  analysis  is  a 
powerful  tool  for  reducing  the  dimensionality  of  a  data  set,  and  PC-based  strategies  were 
explored  in  creating  a  broadly  applicable  image  display  strategy.  It  is  shown  that  the 
invariant  display  strategy  and  generalized  eigenvectors  developed  within  this  study  offer 
a  first  look  capability  for  a  wide  variety  of  spectral  scenes.  PC  transformations  utilizing 
this  generalized  set  of  eigenvectors  allow  for  ‘real  time’  initial  classification.  Detailed 
investigation  of  the  relationship  between  the  PC  eigenvectors  and  dissimilar  image 
content  shows  that  this  strategy  is  robust  enough  to  provide  an  accurate  initial  scene 
classification. 
EXECUTIVE SUMMARY


An  invariant  methodology  has  been  developed  in  response  to  the  need  for 
coherent  and  consistent  display  of  the  hundreds  of  contiguous  narrow  spectral  bands 
present  within  hyperspectral  imagery.  The  methodology  builds  upon  traditional 
hyperspectral  imagery  processing  techniques  and  provides  a  more  robust  first  look 
capability  for  unsupervised  classification. 

This  study  investigated  the  appropriate  techniques  for  displaying  hyperspectral 
images  based  on  the  physical  principles  of  human  color  vision  and  a  generalized  set  of 
linear transformations. Principal components (PC) analysis reduced the dimensionality
of the data sets, and PC-based strategies were explored in creating a broadly applicable
image display strategy. Analysis of hyperspectral images and display strategies was
accomplished utilizing MATLAB and ENVI software.

From  analysis  and  comparison  of  imagery  data  from  Davis-Monthan  Air  Force 
Base to image data obtained from similar and dissimilar scenes, it is clear that, for
comprehensive analysis, it would be appropriate to maintain scenes such as
Davis-Monthan (which consists of desert background) within one group and scenes such as
Jasper  Ridge  (forest)  and  Lake  Tahoe  (forest/water)  within  another  group.  But,  for  first 
order  unsupervised  classification,  the  first  few  eigenvalues  and  associated  eigenvectors 
which  contain  the  largest  amount  of  scene  variance  can  appropriately  represent  a  scene. 
Extending  this  concept  further,  it  is  clear  that  a  generalized  set  of  eigenvectors  can  depict 
any scene content. The average eigenvectors investigated in this study provide such a
basis and can be further improved upon with an increase in the number of data sets
utilized.

A  principal  component-based  mapping  strategy  provides  an  easy  way  to  perform 
first order unsupervised display. The inclusion and utilization of generalized
eigenvectors decrease the overhead required to perform first order display and will allow
for ‘real time’ classification of hyperspectral imagery. By visually inspecting the
resultant image, an analyst can then direct attention to appropriate areas of the scene for
further processing without the time-consuming requirement of calculating the
scene-specific statistics. The generalized PC and RGB transformation eigenvectors utilized in
this  study  can  be  applied  to  a  broad  range  of  spectral  imagery  classes.  These 
eigenvectors  can  become  even  more  robust  as  the  number  of  ‘averaged’  scenes  is 
increased. 

The  1st  PC  will  always  be  related  to  the  mean  solar  radiance,  but  the  2nd,  3rd  and 
subsequent  PCs  depend  on  the  specific  contents  of  the  image.  However,  it  is  also  shown 
that  only  the  first  three  PCs  are  required  for  a  color  mapping  corresponding  to  human 
color  vision.  It  remains  to  be  investigated  whether  or  not  the  RGB  transformation  of  the 
HSV  image  presented  here  can  be  arranged  so  that  materials  are  presented  in  a 
straightforward manner, i.e., water always mapped to blue, vegetation to green, etc., rather
than having the dominant scene constituent set the base hue of the image.

The  presentation  strategy  discussed  here  is  best  suited  to  broadscale  geographical 
classification, not for identifying small, isolated targets. However, objects and variances
within the scene that occur at only a few pixels in an image, and thus have little effect
on the overall covariance matrix and do not contribute significantly to the 2nd and 3rd PCs,
do appear to be discernible in this mapping strategy. For this reason, this aspect of the
mapping strategy merits further investigation.

The invariant display strategy and generalized eigenvectors presented here are
offered as a way to have a first look at a wide variety of spectral scenes. By performing a
PC transformation with these eigenvectors and analyzing the three most significant PCs,
an initial classification decision can be made in ‘real time’.

I. INTRODUCTION


The  introduction  of  imaging  spectroscopy  with  the  Airborne  Imaging  Spectrometer 
in  1982  established  a  powerful  new  tool  for  the  earth  sciences,  but  also  created  a 
fundamentally  new  class  of  data  requiring  new  approaches  to  information  extraction  and 
display methodologies (Vane and Goetz, 1988, p. 1). This new class of data provides a
representation of the spectral character of materials on the ground and will be referred to
as spectral imagery throughout the study. Hyperspectral data, a particular type of spectral
imagery, is produced when solar electromagnetic energy reflected from the earth’s
surface  is  dispersed  into  many  contiguous  narrow  spectral  bands  by  an  airborne 
spectrometer  (Vane  and  Goetz,  1988,  p.  3).  Each  picture  element  (pixel)  of  a 
hyperspectral  image  can  be  thought  of  as  a  high  resolution  trace  of  radiation  versus 
wavelength,  or  a  spectrum  (Rinker,  1990,  p.  6).  The  characteristic  wavelength  dependent 
changes  in  the  emissivity  and  reflectivity  of  a  given  material  can  be  related  to  the 
chemical  composition  and  types  of  atomic  and  molecular  bonds  present  in  that  material 
(Gorman et al., 1995, p. 2805). The chemical composition of different materials is thus
manifested  in  the  spectral  properties  of  these  materials,  and  can  serve  as  a  means  of 
differentiating  materials  observed  in  a  hyperspectral  image  with  great  detail. 

Analysis and display of hyperspectral imagery are complicated by several factors.
The first is the volume of data inherent in a hyperspectral image. A typical 224-band
Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) image contains approximately
134 Mbytes of data (Roger and Cavenor, 1996). Algorithms for processing data sets of
this magnitude must be computationally efficient to be of any service and, if possible,
must seek to eliminate redundant data prior to processing. The second factor is that the
radiances recorded at the spectrometer output are subjected to noise, additive and
possibly multiplicative (Tyo et al., 2000), from the atmosphere, sensor instrumentation,
data quantization procedure, and transmission back to earth. The cumulative effect of
these noise terms is a corrupted spectrum, which degrades meaningful image
representation and further complicates target detection. It
is here where a signal processing point of view is helpful, as the problem has now
become  the  classical  signal  in  noise  problem.  The  third  factor  is  that  because  of  the  finite 
spatial  resolution  of  the  imaging  spectrometer  and  the  actual  ground  scene,  the  observed 
spectrum  for  a  pixel  may  not  be  that  of  a  single  material,  but  could  be  a  mixture  of 
several  different  materials  which  exist  within  the  spatial  dimensions  of  the  sensor’s 
ground  instantaneous  field  of  view  (GIFOV).  Although  the  third  factor  is  primarily 
concerned  with  target  detection  and  classification,  it  will  impact  the  overall  display 
representation. The GIFOV of the AVIRIS sensor at sea level is nominally 20 m
x 20 m (Farrand and Harsanyi, 1995, p. 1566), and the implication is that several materials
could contribute to the observed spectrum for that pixel depending on the complexity of
the ground scene. A fourth factor that complicates analysis efforts is that spectra of the
same type of material may appear very different. This variability within the spectra of a
species or target class dictates a statistical approach rather than a deterministic one
(Tyo et al., 2000; Kerekes et al., 2000).
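
The data volume quoted for the first factor can be checked with simple arithmetic. The sketch below assumes a scene of 614 samples by 512 lines stored as 16-bit values; these dimensions and the storage size are not given in the text and are used here only as illustrative assumptions, as is the convention that 1 Mbyte equals 2**20 bytes.

```python
# Assumed scene geometry (not stated in the text): 614 samples x 512 lines.
samples, lines, bands = 614, 512, 224
bytes_per_value = 2  # assumed 16-bit integer storage per brightness value

total_bytes = samples * lines * bands * bytes_per_value
total_mbytes = total_bytes / 2**20  # 1 Mbyte taken as 2**20 bytes

print(f"{total_bytes} bytes = {total_mbytes:.1f} Mbytes")
```

Under these assumptions the result lands close to the approximately 134 Mbytes cited above, which illustrates why redundancy reduction before processing is attractive.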

There  are  many  types  of  data  processing  techniques  which  address  the  unique 
issues  raised  by  hyperspectral  imagery.  Many  grew  out  of  earlier  techniques  which  had 
been  successfully  applied  to  multispectral  imagery,  the  precursor  of  hyperspectral 
imagery. Others have a foundation in the discipline of pattern recognition. Another
approach, which is naturally suited to the task of detecting signals in the presence of
noise and multiple interfering signals, is based on signal processing. The signal
processing approach handles the data efficiently by viewing it in terms of vectors and
matrices and processes it through various linear transformations; it is the methodology
utilized in this study.

The  major  goal  of  this  study  is  to  expand  the  knowledge  and  methodology  of 
hyperspectral  image  display  strategies  and  a  secondary  goal  is  to  provide  a  mapping 
strategy  that  can  be  used  on  a  wide  variety  of  hyperspectral  images.  This  study  is 
organized  in  a  manner  that  will  facilitate  the  goal  of  an  orderly  approach  to  invariant 
display  strategies  for  hyperspectral  images.  Chapter  II  presents  an  overview  of 
hyperspectral  imagery  and  introduces  the  statistical  signal  processing  approach  to  data 
analysis.  Chapter  III  describes  human  vision  and  relates  it  to  a  statistical  signal 
processing  approach.  This  chapter  also  details  why  human  visual  perception  must  be 
accounted for in any display methodology. Chapter IV details the methods utilized in
this study for processing hyperspectral image files and identifies the specific types of
data to which they are applied. Chapter IV also contains an analysis of the various display
strategies  found  to  be  most  effective  for  a  variety  of  hyperspectral  image  types.  Chapter 
V  concludes  the  study  and  seeks  to  solidify  the  connections  between  specific 
hyperspectral  data  sets  and  the  most  appropriate  display  strategies. 
II. BACKGROUND


A.  PROBLEM  STATEMENT 

Within  the  past  several  decades,  many  strategies  for  target  identification,  material 
classification,  terrain  mapping,  etc.,  have  been  developed  to  exploit  the  information  in  the 
hundreds  of  spectral  samples  taken  at  each  pixel  in  a  scene.  Once  a  classification 
algorithm  or  image  processing  tool  has  been  applied  to  a  spectral  image,  the  resulting 
processed  data  is  invariably  mapped  into  a  pseudocolor  image.  While  many  display 
methodologies  are  quite  powerful,  there  is  no  standard  tool  used  to  render  spectral 
imagery  in  false-color  images  for  presentation. 

Currently,  the  use  of  false  color  displays  is  generally  reserved  as  a  tool  for 
presenting  data  after  processing.  Once  a  scene  has  been  classified  by  a  particular 
algorithm,  a  specifically  tailored  colormap  is  created  to  emphasize  the  performance  of  the 
classification  system.  Commonly,  in  an  attempt  to  distinguish  scene  elements,  one 
displays  the  data  as  an  initial  processing  step,  but  rarely  is  visualization  in  and  of  itself 
used  as  a  tool  that  allows  the  spectral  analyst  to  perform  identification  before  cueing 
more  powerful  processing  strategies.  Most  colormaps  in  use  today  have  been  developed 
based  on  the  mathematics  of  spectral  images  without  considering  the  workings  of  the 
human  vision  system.  It  has  been  demonstrated  that  failure  to  consider  how  the  observer 
processes  data  visually  can  make  information  difficult  to  find  in  an  image,  even  when  the 
data are clearly available (Tyo et al., 1998).

B. HYPERSPECTRAL IMAGERY OVERVIEW


In order to fully understand the need for an invariant false-color mapping strategy,
a review of the historical perspective and paradigms in the analysis of hyperspectral
images is necessary. Figure 2.1 illustrates the major image analysis paradigms over the
past seventy years. This is not an all-inclusive history, but a quick synopsis of the major
ideas behind hyperspectral imagery analysis. Note that there is no visual representation
strategy within any of the paradigms.

Photointerpretation (1930s)
    : 2-D Images
    : good qualitative analysis (human)
    : poor quantitative analysis

Digital Imagery (1960s)
    : 2-D Images
    : Pattern Recognition, Computer Vision
    : Emphasis on Classification Techniques

Multispectral Imagery (1970s)
    : 3-D Images
    : Principal Components Analysis
    : Land Usage Classification

Hyperspectral Imagery (1980s)
    : 3-D Images
    : Need to reduce data dimensionality
    : Software Packages with Spectral Libraries
    : Need efficient processing techniques

Figure 2.1: Major Imagery Analysis Paradigms.


The  analysis  of  imagery  began  in  the  early  part  of  this  century  with 
photointerpretation.  The  analysis  of  aerial  photographs  to  extract  information  of  interest 
was a strictly human operation. The strength of the human element in interpretation was
the ability to recognize large-scale patterns (Richards, 1986, p. 75) and make inferences
based on these patterns. However, the weakness of human image interpretation was the
inability to accurately quantify the results in a consistent manner.

The computing power that began to become available in the 1960s and the ability
to  represent  data  in  a  digital  fashion  provided  the  impetus  for  automation  of  the 
photointerpretation  task  into  digital  imagery  analysis.  Here,  the  computer  was 
programmed  to  work  within  narrow  parameters,  such  as  counting  the  number  of 
occurrences  of  certain  brightness  values,  a  job  that  it  performed  more  quickly  and 
accurately  than  any  human  analyst.  The  fields  of  pattern  recognition  and  computer  vision 
became  important,  and  a  statistical  description  of  the  data  was  needed  to  form  the  basis  of 
classification  schemes  which  could  accurately  determine  the  number  of  pixels  in  the 
scene  belonging  to  a  certain  class.  Linear  prediction  and  spatial  principal  components 
analysis  (PCA)  were  tools  that  assisted  in  the  automated  detection  of  a  target  in  the  two- 
dimensional  digital  images. 

The advent of multispectral imagery with Landsat data in the 1970s added the
spectral dimension to the problem of imagery analysis: if the number of spectral
samples at each pixel is N, there is now N times the amount of data for analysis. PCA
played a significant role in reducing the dimensionality of the data (a decrease from N
samples to M < N linear combinations) by exploiting redundancy within the
data  and  assisted  in  the  classification  of  large  land  areas.  The  relationship  between  PCA 
techniques  and  classification  techniques  was  a  sequential  operation  in  that  PCA  was  first 
applied  to  an  image  to  remove  the  redundant  information  or  create  a  better  class 
separation before application of a classifier. This preprocessing application of PCA
continues today. It is also important to note that PCA in multispectral imagery (MSI) and
hyperspectral imagery (HSI) is performed in the spectral dimension, while in pattern
recognition and photography it is performed spatially.
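
The reduction from N spectral samples to M < N linear combinations described above can be sketched as follows. This is a minimal illustration on synthetic data, not the processing chain used in this study; the function name `pca_reduce` and the synthetic six-band data are assumptions introduced for the example.

```python
import numpy as np

def pca_reduce(pixels, m):
    """Project pixel vectors (one per row) onto the m leading principal components.

    pixels: array of shape (num_pixels, num_bands).
    Returns the projected data, the eigenvalues (descending), and the eigenvectors.
    """
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    # Sample covariance of the bands (spectral dimension), divided by num_pixels - 1.
    cov = centered.T @ centered / (pixels.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return centered @ eigvecs[:, :m], eigvals, eigvecs

# Synthetic stand-in for a scene: 1000 pixels, 6 highly redundant bands
# generated from only 2 underlying components plus a little noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 2))
mixing = rng.normal(size=(2, 6))
pixels = base @ mixing + 0.01 * rng.normal(size=(1000, 6))

pcs, eigvals, _ = pca_reduce(pixels, m=2)
explained = eigvals[:2].sum() / eigvals.sum()
print(pcs.shape, f"fraction of variance retained: {explained:.4f}")
```

Because the synthetic bands are nearly redundant, two components retain almost all of the variance, which is the property PCA exploits when it is used as a preprocessing step.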

The 1980s and hyperspectral imagery ushered in a new challenge to the existing
methods of analyzing data. Hyperspectral imagery increased the number of spectral
bands from fewer than 10 to more than 200, increasing the volume of data by a factor of
20-50.  Thus,  data  compression  became  an  important  concern.  The  search  for  new 
techniques  to  deal  with  the  large  amount  of  information  and  commensurate  amount  of 
redundancy  prompted  new  views  of  the  analysis  paradigm.  Ideas  from  the  signal 
processing  community  provided  a  means  of  handling  the  large  amount  of  data  and 
confronting  the  mixed  pixel  problem.  Software  packages  dedicated  to  the  analysis  of 
hyperspectral  imagery,  such  as  ENVI,  incorporated  spectral  libraries  and  found  particular 
interest  in  the  geological  remote  sensing  community. 

C.  DEFINITIONS 

An  understanding  of  the  fundamental  ideas  behind  the  various  spectral  imagery 
analysis  techniques  is  important  because  it  forms  the  basis  for  all  imagery  analysis  and 
image  display  methodologies.  The  fundamental  ideas  involve  concepts  from  statistics, 
linear  algebra,  and  signal  processing  theory.  Discussion  of  these  ideas  in  the  context  of 
spectral imagery sets the stage for the detailed discussion of display strategies that follows.

This  section  presents  multispectral  and  hyperspectral  images  as  a  means  of 
further highlighting certain properties of the spectral concept. The images are also
characterized from a statistical view, which assists in better understanding the image
content  and  the  statistical  principles  used  in  spectral  imagery  analysis.  Some  concepts 
from  linear  algebra  and  signal  processing  are  defined  to  provide  a  framework  through 
which  certain  spectral  imagery  analysis  techniques  and  display  methodologies  are 
understood.  These  perspectives  offer  a  means  of  defining  key  concepts  that  appear 
throughout  this  study.  An  effort  has  been  made  to  make  these  definitions  simple  yet 
comprehensive  through  the  use  of  illustrative  examples. 

1.  Spectral  Imagery 

Spectral  imagery  is  the  acquisition  of  images  at  multiple  wavelengths  by 
spectrometers  onboard  aircraft  or  spacecraft.  Two  primary  classes  of  such  measurements 
are traditional multispectral images, such as those produced by the Thematic Mapper
(TM) radiometer on the Landsat satellites, and hyperspectral imagery, produced by
imaging spectrometers such as the Airborne Visible/Infrared Imaging Spectrometer
(AVIRIS) and Hyperspectral Digital Imagery Collection Experiment (HYDICE) systems.
Typical  Landsat  TM  and  AVIRIS  images  will  be  used  here  to  introduce  many  of  the 
concepts  needed  for  this  study.  These  data  sets  will  also  be  used  to  illustrate  display 
strategies  in  future  sections.  The  Landsat  TM  scene  in  Figure  2.2  is  a  six-band,  640-pixel 
x  400-pixel  image  of  Canon  City,  Colorado  which  is  provided  with  ENVI  software  on  the 
ENVI-DATA  CD-ROM.  The  scene  is  an  image  of  a  city  surrounded  by  mountains.  The 
six  distinct  image  planes  present  in  Figure  2.2  represent  the  various  wavelength  ranges 
sensed  by  Landsat  TM.  Notice  how  objects  which  appear  bright  in  one  band  may  appear 
dark in another band. The mountain ridgeline, found on the left side of the image,
illustrates this effect. Through this sort of contrast, Landsat imagery offers a very
basic means of discerning the spectral character of a particular class of material.


A representative AVIRIS scene of Jasper Ridge, which is also provided on the
ENVI-DATA CD-ROM, was chosen for comparison. The scene shows a biological
preserve surrounded by a small city. Data sets of this area are typically utilized for
vegetation analysis.


[Figure 2.2 image panels; recoverable panel labels: λ = 1.650 µm, λ = 2.215 µm]

Figure 2.2: A Typical 6-Band Multispectral Image Produced by Landsat TM. (Note
different shadings between bands.)


Figure  2.3  shows  three  representations  of  the  hyperspectral  image  consisting  of 
300 samples, 250 lines, and 224 bands. The first image is a grayscale representation of
band  37,  the  second  image  is  a  red,  green,  blue  composite  formed  using  bands  176,  91, 
and  31,  and  the  third  image  is  a  red,  green,  blue  composite  formed  using  bands  25,  120, 
and  200.  The  Jasper  Ridge  representation  in  Figure  2.3  shows  only  a  small  subset  of  the 
wide  range  of  color  mappings  available  to  an  analyst. 


Figure 2.3: Jasper Ridge Color Representation. Panel a is an achromatic representation
of Band 37 (702.5 nm). Panel b is an R-G-B representation with Red 2208.7 nm, Green
1221.0 nm, and Blue 665.7 nm. Panel c is an R-G-B representation with Red 606.4 nm,
Green 1483.4 nm, and Blue 2268.4 nm.
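
A three-band composite of the kind shown in Figure 2.3 can be sketched as follows. This is only an illustration on a synthetic cube: the function name `band_composite`, the 1-based band numbering, the (lines, samples, bands) array layout, and the per-channel min-max stretch are all assumptions and are not taken from the processing used to produce the figure.

```python
import numpy as np

def band_composite(cube, r, g, b):
    """Build an RGB image from three bands of a (lines, samples, bands) cube.

    Band numbers are assumed to be 1-based. Each channel is stretched
    independently to the range [0, 1] (an assumed display choice).
    """
    rgb = np.stack([cube[:, :, k - 1] for k in (r, g, b)], axis=-1).astype(float)
    lo = rgb.min(axis=(0, 1), keepdims=True)
    hi = rgb.max(axis=(0, 1), keepdims=True)
    # Guard against a constant channel to avoid division by zero.
    return (rgb - lo) / np.where(hi > lo, hi - lo, 1.0)

# Synthetic stand-in with the dimensions quoted in the text:
# 250 lines, 300 samples, 224 bands.
rng = np.random.default_rng(1)
cube = rng.integers(0, 4096, size=(250, 300, 224))

rgb_b = band_composite(cube, 176, 91, 31)    # band choice of panel b
rgb_c = band_composite(cube, 25, 120, 200)   # band choice of panel c
print(rgb_b.shape, rgb_c.shape)
```

Any three of the 224 bands can be passed to the same function, which is why the two composites in the figure represent only a small subset of the mappings available to an analyst.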

One  way  of  visualizing  data  that  has  two  spatial  and  one  spectral  dimension  is  as 
a cube. Because of the finer spectral resolution, the ability to identify materials
based on spectral detail is greater with hyperspectral imagery than with
multispectral imagery (Goetz, 1995). Figure 2.4 emphasizes the high spectral resolution
of  hyperspectral  data  by  extracting  information  in  the  spectral  dimension,  or  downward  in 
the  axes  of  the  cube.  It  shows  the  construction  of  an  observed  spectrum  associated  with  a 
particular  spatial  location,  called  a  pixel  vector. 

Figure 2.4: The Concept of a Pixel Vector. From Vane and Goetz, 1988, p. 2.


The  pixel  vector  is  central  to  the  discussion  which  follows,  since  the  pixel  vector  may  be 
viewed  as  a  unique  signal  associated  with  a  material  of  interest.  Figure  2.5  further 
illustrates  the  pixel  vector  concept  using  randomly  chosen  observed  spectra  from  the 
Landsat  and  AVIRIS  images.  The  fine  spectral  detail  that  can  be  discerned  in  the 
hyperspectral  image  spectrum  is  a  stark  contrast  to  the  coarse  detail  that  comes  from  six 
data  points,  as  in  the  Landsat  observed  spectrum.  The  implication  is  that  the 
characteristic  shape  of  the  pixel  vectors  obtained  using  hyperspectral  imagery  allows  a 
more  definitive  identification  of  material  based  on  unique  spectral  characteristics. 
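
In array terms, the pixel vector is simply the slice of the image cube along the spectral axis at one spatial location. The sketch below assumes a (lines, samples, bands) array layout and uses a synthetic cube; the helper name `pixel_vector` is hypothetical.

```python
import numpy as np

lines, samples, bands = 250, 300, 224
rng = np.random.default_rng(2)
cube = rng.random((lines, samples, bands))  # synthetic stand-in for a scene

def pixel_vector(cube, line, sample):
    """Return the spectrum (one value per band) at a single spatial location."""
    return cube[line, sample, :]

x = pixel_vector(cube, 10, 20)

# Flattening the two spatial dimensions gives one pixel vector per row,
# a convenient layout for the statistical treatment of a scene.
pixels = cube.reshape(-1, bands)
print(x.shape, pixels.shape)
```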

The hyperspectral sensors AVIRIS and HYDICE have spectral bands that are
configured to cover a range of 400 to 2500 nm. The observations of the reflected energy
at the sensor are measured in terms of radiance, which has units of watts per square meter
per steradian ($\mathrm{W\,m^{-2}\,sr^{-1}}$). The spectral irradiance describes how much
power density is available incrementally across the wavelength range and is measured in
$\mathrm{W\,m^{-2}\,\mu m^{-1}}$ (Richards, 1986).

Figure 2.5: Typical Pixel Vectors From Multispectral and Hyperspectral Images.


A  significant  portion  of  the  spectrum  imaged  in  the  AVIRIS  system  is  dominated 
by  solar  energy  reflected  from  the  earth’s  surface.  This  solar  energy  accounts  for  the 
characteristic  “hump”  in  roughly  the  15th  to  the  37th  bands  (500nm-700nm).  At  times,  it 
is  desirable  to  mitigate  the  effect  of  the  dominant  solar  curve  so  that  other  spectral  details 
may  be  discerned.  One  means  of  doing  so  entails  converting  radiance  measurements  to 
reflectance  measurements  by  dividing  the  radiance  observations  by  the  scene  average 
spectrum.  Other  methods  include  the  use  of  calibration  panels,  flat  field  calibration,  and 
numerical techniques (ATREM, etc.). The net effect is to normalize the radiance
measurements  in  such  a  manner  that  the  solar  bias  is  removed  and  the  resulting 
reflectance  spectrum  appears  flatter.  For  the  purposes  of  this  study,  raw  radiance  data 
will  primarily  be  utilized  because  analysis  of  this  data  will  limit  the  amount  of 
preprocessing  required  and  allow  for  a  better  understanding  of  overall  scene 
characteristics. 
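
The first normalization mentioned above, dividing each radiance observation by the scene average spectrum, can be sketched as follows. The synthetic "solar-like" curve and the function name `normalize_by_scene_mean` are assumptions made for illustration, and the sketch assumes that no band has a zero scene average.

```python
import numpy as np

def normalize_by_scene_mean(cube):
    """Divide every pixel spectrum by the scene average spectrum.

    cube: array of shape (lines, samples, bands) holding radiance values.
    Assumes the scene average is nonzero in every band.
    """
    n_bands = cube.shape[-1]
    mean_spectrum = cube.reshape(-1, n_bands).mean(axis=0)
    return cube / mean_spectrum, mean_spectrum

# Synthetic radiance: a smooth hump (standing in for the solar curve)
# multiplied by random per-pixel reflectance-like factors.
rng = np.random.default_rng(3)
n_bands = 224
band_index = np.arange(n_bands)
solar_like = 1.0 + 5.0 * np.exp(-((band_index - 26) / 10.0) ** 2)
reflectance_like = rng.uniform(0.2, 0.8, size=(50, 60, n_bands))
radiance = reflectance_like * solar_like

normalized, mean_spectrum = normalize_by_scene_mean(radiance)
print(normalized.reshape(-1, n_bands).mean(axis=0)[:5])
```

After normalization the scene average is one in every band, so the dominant hump no longer masks the band-to-band differences between pixels.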

2. Statistical Interpretation

In  order  to  assist  in  the  quantitative  discussion  of  characterizing  the  data 
statistically,  we  need  to  formally  define  the  concept  of  the  observed  pixel  vector.  Assume 
that the observed pixel vector $\mathbf{x}$ is a real-valued random vector

$$\mathbf{x} = [x_1, x_2, \ldots, x_L]^T \qquad (2.1)$$

where the components $\{x_1, \ldots, x_L\}$ correspond to measured brightness values in each of L
spectral  bands.  Since  a  stochastic  view  of  the  data  assumes  that  these  vectors  are 
random  entities,  one  means  of  characterizing  them  is  to  describe  their  behavior  using 
statistical  concepts.  Exact  statistical  descriptions  of  their  behavior  are  unavailable  in 
real  applications,  so  we  must  rely  on  methods  that  estimate  the  statistics  of  the  observed 
random  vectors. 

There  are  three  major  statistical  definitions  of  interest  in  this  respect.  The  first  is 
the  concept  of  expectation.  The  expectation  of  a  random  vector  is  called  the  mean  or  the 
average  value  that  the  random  vector  assumes  and  is  denoted  as  E{x}.  The  mean  is  also 
called  the  first  moment  since  it  involves  only  the  random  vector  itself  and  not  products  of 
the  components  of  the  vector  x  (Therrien,  1992,  p.  33).  In  using  the  observed  data,  it  is 
often  desirable  that  the  statistical  expectation  of  the  estimated  mean  equal  the  actual 
mean.  This  is  called  an  unbiased  estimate  of  the  mean.  The  framework  for  this 
estimation  is  to  view  the  spectral  image  or  scene  as  a  collection  of  N  random  pixel 
vectors.  This  implies  that  the  scene  is  comprised  of  N  pixel  vectors,  each  consisting  of  an 
L-band spectrum. The unbiased estimate of the mean spectrum for the scene is given by:

$$\mathbf{m} = \frac{1}{N}\sum_{j=1}^{N}\mathbf{x}_j \qquad (2.2)$$

where $\mathbf{x}_j$ represents the spectrum of the jth pixel of the scene. The mean spectrum vector,
m, of Equation 2.2 can also be interpreted as an L-dimensional vector with each
component  representing  the  average  brightness  value  over  the  entire  image  for  one 
particular  band. 

The  second  definition  of  importance  in  characterizing  random  vectors  is  that  of 
the  covariance  matrix.  The  covariance  matrix  is  defined  in  vector  and  expanded 
component  form  as: 


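$$\Sigma_x = E\{(\mathbf{x}-\mathbf{m})(\mathbf{x}-\mathbf{m})^T\} =
\begin{bmatrix}
E\{(x_1-m_1)^2\} & E\{(x_1-m_1)(x_2-m_2)\} & \cdots & E\{(x_1-m_1)(x_L-m_L)\} \\
E\{(x_2-m_2)(x_1-m_1)\} & E\{(x_2-m_2)^2\} & \cdots & E\{(x_2-m_2)(x_L-m_L)\} \\
\vdots & \vdots & \ddots & \vdots \\
E\{(x_L-m_L)(x_1-m_1)\} & E\{(x_L-m_L)(x_2-m_2)\} & \cdots & E\{(x_L-m_L)^2\}
\end{bmatrix} \qquad (2.3)$$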


where  m  is  the  mean  vector  of  the  entire  image  defined  in  Equation  2.2.  The  covariance 
matrix  is  symmetric,  and  the  elements  of  the  main  diagonal  represent  the  variances 
associated  with  each  of  the  component  variables  of  the  random  vector  x.  In  the  case  of 
spectral  imagery,  the  variance  is  a  measure  of  how  the  brightness  value  of  a  particular 
band  varies  over  all  spatial  image  pixels  and  the  covariance  describes  the  scatter  of  pixel 
points  in  the  principal  components  vector  space. 

The  covariance  matrix  is  the  set  of  second  central  moments  of  the  distribution, 
which are also referred to as moments about the mean since the mean component is
subtracted from each random variable. The unbiased estimate of the covariance matrix is
generated by:

$$\Sigma_x = \frac{1}{N-1}\sum_{j=1}^{N}(\mathbf{x}_j-\mathbf{m})(\mathbf{x}_j-\mathbf{m})^T \qquad (2.4)$$

where  xj  is  again  the  pixel  vector  associated  with  the  jth  spatial  location  (Richards,  1993, 
p.  134).  When  the  covariance  of  two  random  variables  is  zero,  then  the  random  variables 
are  said  to  be  uncorrelated,  which  implies  that  those  random  variables  were  generated  by 
separate  random  processes  (Leon-Garcia,  1994,  p.  337).  In  the  calculation  of  the 
unbiased  estimates  of  statistical  quantities,  the  computational  expense  of  the  covariance 
matrix  for  a  large  number  of  samples,  N,  must  be  balanced  with  the  desired  degree  of 
accuracy  for  the  estimate.  More  samples  imply  better  estimates,  and  in  order  to  ensure 
sufficient  accuracy,  the  number  of  samples  must  be  sufficiently  large  (Fukunaga,  1971, 
p.242). 
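
The estimates of the scene mean (Equation 2.2) and covariance matrix (Equation 2.4) can be sketched in a few lines. The example uses synthetic data with one pixel vector per row and divides by N - 1, the usual choice for an unbiased covariance estimate; the function name `scene_statistics` is an assumption introduced for illustration.

```python
import numpy as np

def scene_statistics(pixels):
    """Estimate the mean spectrum and covariance matrix of a scene.

    pixels: array of shape (N, L), one L-band pixel vector per row.
    """
    n = pixels.shape[0]
    m = pixels.sum(axis=0) / n          # mean spectrum, Equation 2.2
    d = pixels - m                      # mean-removed pixel vectors
    sigma = d.T @ d / (n - 1)           # covariance estimate, Equation 2.4
    return m, sigma

# Synthetic scene: N = 5000 pixels, L = 6 bands with positive brightness values.
rng = np.random.default_rng(4)
pixels = rng.gamma(shape=4.0, scale=20.0, size=(5000, 6))

m, sigma = scene_statistics(pixels)
print(m.shape, sigma.shape)
```

The result is an L x L symmetric matrix whose main diagonal holds the per-band variances. The cost of the matrix product grows with N, which is the trade-off between computational expense and accuracy noted above.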

The  third  statistical  definition  involves  an  issue  that  requires  clarification 
regarding  the  term  “correlation”  matrix.  In  signal  processing  terminology,  the  correlation 
matrix, stated as $E\{\mathbf{x}\mathbf{x}^T\}$, is formed exactly as the covariance matrix, except that the mean
vector  is  not  subtracted  from  the  random  vector  x  (Therrien,  1992,  p.  33).  Figure  2.6 
demonstrates  the  concept  of  mean  removal  using  the  scatter  plots  of  two  bands  of  Landsat 
data.  The  scatter  plots  are  a  representation  of  many  two-dimensional  random  vectors 
which  have  a  two-dimensional  mean  vector.  The  subtraction  of  this  mean  vector  from 
every  random  vector  results  in  a  centering  of  the  data  about  the  origin.  This  introduces 
negative  numbers  into  the  previously  positive  data  values. 
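The difference between the two second moment matrices can be checked numerically. The sketch below assumes synthetic two-band data (the array `x` is hypothetical, not the Landsat scene) and uses a 1/N normalization for both matrices so that the identity E{xx^T} = covariance + m m^T holds exactly for the samples.

```python
import numpy as np

# Hypothetical positive-valued two-band data standing in for raw brightness values.
rng = np.random.default_rng(1)
x = rng.uniform(20.0, 120.0, size=(1000, 2))

m = x.mean(axis=0)                     # two-dimensional mean vector
r_sp = x.T @ x / len(x)                # sample E{x x^T}: mean NOT removed
centered = x - m                       # mean removal centers the data at the origin
c = centered.T @ centered / len(x)     # same normalization, mean removed

# The two matrices differ exactly by the outer product of the mean vector.
mean_gap = r_sp - c
```

After mean removal, `centered` contains negative values even though every entry of `x` is positive, which is the effect illustrated by the scatter plots.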


[Figure panel: Canon City Scatterplot, Bands 2 and 3 (mean removed)]

Figure 2.6: Mean Removal Illustration With Scatter Plots.

While the correlation matrix is more frequently used in signal processing where
zero mean signals are the norm, remote sensing uses the covariance matrix since negative
radiance values do not have a clear physical significance.

In  statistical  and  remote  sensing  applications,  the  correlation  matrix  is  usually 
defined  in  terms  of  the  covariance  matrix.  The  ijth  element  of  the  statistical  version  of  the 
correlation  matrix  is: 

$$\rho_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\,\sigma_{jj}}} \qquad (2.5)$$

where $\sigma_{ij}$ is an element of the covariance matrix and is the covariance between bands i
and j in $\Sigma_x$, $\sigma_{ii}$ represents the variance of the ith band of data, and the square root of
variance is defined as the standard deviation (Richards, 1993, p. 135).

The statistical and signal processing versions of correlation do not produce the
same matrix. The statistical definition produces a matrix which has a unit main diagonal
and can be represented as:

$$R_x = \begin{bmatrix} 1 & \rho_{12} & \cdots & \rho_{1N} \\ \rho_{21} & 1 & \cdots & \rho_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{N1} & \rho_{N2} & \cdots & 1 \end{bmatrix}$$

(Searle, 1982, p. 348). It is apparent that dividing the covariance matrix elements by
their standard deviations has the effect of reducing all the variables to an equal
importance since all have unit variance. The signal processing definition does not
produce a unit diagonal matrix, though it is symmetric. The off diagonal elements of $R_x$,
represented by $\rho_{ij}$, are called correlation coefficients. They range between -1 and +1 in
value, and provide a measure of how well two random variables vary jointly by
quantifying the degree of fit to a linear model (Research Systems, Inc., 1995, p. 20-6). A
value near +1 or -1 represents a high degree of fit between the random variables to a
positive or negative linear model, whereas a value near zero implies that the random
variables exhibit a poor fit to the linear model. The conclusion that may be drawn is that
a high degree of fit implies well-correlated random variables, whereas a correlation
coefficient of zero is indicative of statistically orthogonal random variables. We will
assume that we are dealing with the statistical definition of the correlation matrix, though
a more descriptive term for the “correlation” matrix might be the “normalized” or
“standardized” covariance matrix.


The definitions of statistical properties become clearer when they are linked to a
physically observable phenomenon. The next few illustrations attempt to show the large
amount of information revealed by the statistics of the data. Table 2.1 shows the
covariance and correlation matrices for the Landsat data of Canon City.

Covariance Matrix for Canon City TM Data

Band      Band 1       Band 2       Band 3       Band 4       Band 5       Band 6
1       45.391430    54.157121    62.472157    48.824654    48.318837    43.826169
2       54.157121    69.760492    79.432844    64.610172    64.421129    57.616349
3       62.472157    79.432844    96.376932    77.863810    79.247686    70.393457
4       48.824654    64.610172    77.863810   100.500970    74.711596    57.981971
5       48.318837    64.421129    79.247686    74.711596    87.056432    70.991722
6       43.826169    57.616349    70.393457    57.981971    70.991722    63.739045

Correlation Matrix for Canon City TM Data

Band      Band 1       Band 2       Band 3       Band 4       Band 5       Band 6
1        1.000000     0.962418     0.944524     0.722881     0.768651     0.814786
2        0.962418     1.000000     0.968744     0.771633     0.826653     0.864049
3        0.944524     0.968744     1.000000     0.791159     0.865166     0.898138
4        0.722881     0.771633     0.791159     1.000000     0.798735     0.724444
5        0.768651     0.826653     0.865166     0.798735     1.000000     0.953025
6        0.814786     0.864049     0.898138     0.724444     0.953025     1.000000

Table 2.1: Covariance and Correlation Matrices of Landsat TM Data.

In examining the Landsat covariance matrix, we see that the highest variance
results  from  band  four,  the  lowest  covariance  is  between  bands  one  and  six,  and  the 
highest  covariance  is  between  bands  two  and  three.  The  correlation  coefficient  is  highest 
between  bands  two  and  three  and  is  lowest  between  bands  one  and  four.  We  can  draw 
some  conclusions  from  these  statistics.  First,  band  four  has  more  variance,  or  contrast 
over  the  scene,  than  any  other  band.  Before  we  assume  that  this  means  that  band  four  can 
detect  some  sort  of  unique  information  better  than  other  bands,  we  must  ask  if  this 
variance  was  caused  by  signal  coming  from  the  ground  or  if  it  was  noise  introduced  by 
our  sensor  or  the  atmosphere  in  that  particular  band.  If  we  know  the  signal-to-noise  ratio 
of  our  sensor  in  band  four  then  we  can  answer  the  question.  Signal-to-noise  ratio  (SNR) 
is  the  ratio  of  signal  power  to  noise  power,  and  can  be  obtained  using  the  variances  as  the 
power.  Second,  band  one  exhibits  the  lowest  correlation  coefficient  when  compared  to 
all  other  bands.  Again,  before  we  assume  that  band  one  detects  unique  information,  we 
must  ask  about  the  signal-to-noise  characteristics  of  band  one.  For  example,  if  band  one 
were  purely  noise,  then  it  would  exhibit  an  even  lower  correlation  with  other  bands, 
perhaps  even  zero.  This  is  because  it  is  independent  of  the  other  bands,  not  because  it 
carries  any  information. 
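The observations above can be reproduced directly from the covariance values in Table 2.1. The following sketch applies Equation 2.5 to those values; the variable names and the helper `band_pair` are illustrative assumptions, not part of the study.

```python
import numpy as np

# Covariance matrix for the Canon City TM data, copied from Table 2.1.
cov = np.array([
    [45.391430, 54.157121, 62.472157,  48.824654, 48.318837, 43.826169],
    [54.157121, 69.760492, 79.432844,  64.610172, 64.421129, 57.616349],
    [62.472157, 79.432844, 96.376932,  77.863810, 79.247686, 70.393457],
    [48.824654, 64.610172, 77.863810, 100.500970, 74.711596, 57.981971],
    [48.318837, 64.421129, 79.247686,  74.711596, 87.056432, 70.991722],
    [43.826169, 57.616349, 70.393457,  57.981971, 70.991722, 63.739045],
])

std = np.sqrt(np.diag(cov))        # per-band standard deviations
corr = cov / np.outer(std, std)    # Equation 2.5 applied to every element

iu = np.triu_indices(6, k=1)       # index pairs above the main diagonal

def band_pair(k):
    """Convert a position in the upper-triangle list to 1-based band numbers."""
    return int(iu[0][k]) + 1, int(iu[1][k]) + 1

highest_variance_band = int(np.argmax(np.diag(cov))) + 1
lowest_cov_pair = band_pair(np.argmin(cov[iu]))
highest_cov_pair = band_pair(np.argmax(cov[iu]))
lowest_corr_pair = band_pair(np.argmin(corr[iu]))
highest_corr_pair = band_pair(np.argmax(corr[iu]))
```

Because the correlation matrix is derived entirely from the covariance matrix, only one of the two needs to be stored; the other can be regenerated on demand.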

The  scatter  plot  is  another  means  of  characterizing  the  statistics  of  the  data  by 
visually  presenting  the  two-dimensional  distribution  of  pixels  using  two  selected  bands. 
Two  band  combinations  are  shown  in  Figure  2.7.  The  scatter  plot  is  a  representation  of 
all  of  the  two-dimensional  random  pixel  vectors  formed  by  the  two  bands  of  interest.  By 
plotting  the  data  of  one  band  against  that  of  another,  information  regarding  the  statistical 
similarity of bands may be inferred. The scatter plots for the Landsat image show a
definite linear feature when a high correlation coefficient exists, as between bands two
and three. Thus, bands two and three are statistically similar, to the extent that there


This graphically depicts the more independent and less correlated nature of the data in
band four, as evidenced by the lower correlation coefficient of 0.7229. The scatter plot
also clearly shows groupings of pixels that have the most variance and will form the basis
for the study's false-color mapping strategy.

In  order  to  show  the  second  order  statistics  of  a  hyperspectral  image,  another 
visualization  technique  is  introduced.  With  224  bands,  manually  examining  the 
covariance  matrix  would  be  tedious,  and  comparing  two  bands  at  a  time  with  scatter 
plots  would  be  similarly  ineffective  and  time  consuming.  For  hyperspectral  data 
statistics,  the  elements  in  the  covariance  matrices  are  assigned  color  values 
corresponding  to  their  value.  The  result  is  a  color  matrix  which  helps  in  explaining 
trends. Figure 2.8 illustrates the covariance and correlation matrices for the radiance
data in the AVIRIS Jasper Ridge scene.

[Figure panels: Covariance, Correlation]

Figure 2.8: Second Order Statistics of the AVIRIS Jasper Ridge Scene.

There  are  several  notable  features  in  the  two  matrices.  In  the  radiance  covariance 
matrix,  we  see  the  effect  of  the  sun  on  bands  50  to  70  manifested  in  the  higher  variance 
and covariance values. This is because the covariance matrix is constructed in a manner
that uses the absolute radiance values, which are very large in these bands for radiance
data. The correlation matrix of the radiance does not show this uneven weighting of
variances.  Instead,  the  correlation  coefficients  closest  to  the  main  diagonal  exhibit  a 
fairly  similar  value  over  all  image  bands,  indicating  that  the  correlation  matrix  has 
normalized  the  variances  and  covariances  with  respect  to  their  standard  deviations.  The 
high  values  in  the  vicinity  of  the  main  diagonal  are  indicative  of  an  important 
characteristic  of  hyperspectral  imagery,  namely  the  high  correlation  between  adjacent 
bands.  Both  of  the  matrices  show  the  effects  of  the  absorption  bands  as  areas  of  very 
low  covariances  and  correlation  coefficients.  This  is  intuitively  pleasing,  since  the 
absorption  bands  should  be  very  uncorrelated  with  all  other  bands.  These  dark  vertical 
and  horizontal  features  on  the  matrices  represent  the  presence  of  atmospheric  absorption 
features and are a good illustration of the effect of additive noise. The bands
corresponding to these absorption features have had the “signal” drowned out by
“noise” introduced by the atmosphere. The atmospheric effect is multiplicative in nature,
whereas the additive noise is introduced at the sensor. Note also that the main diagonal
trace is specifically $\sum \sigma_i^2$, and represents the variance associated with each band.

The blocky, segmented nature of the second order statistics matrices reveals
important details about the scene. The low covariances in the absorption bands are
easily explained because the brightness values in those bands are so statistically
different from all other bands. More subtly, these matrices show the degree of difference
or similarity between the brightness values in other parts of the observed spectra.

In order to illustrate this concept, a Davis-Monthan Air Force Base HYDICE
radiance data set is introduced. Figure 2.9 shows the covariance and correlation
matrices for this data set. The scene is a good contrast to the Jasper Ridge image
because the predominant background material is sand instead of vegetation.


Figure  2.9:  Second  Order  Statistics  of  a  HYDICE  Davis  Monthan  Scene. 


Recalling the plots of various pixel vectors seen in Figure 2.5, note how the
spectrum of the vegetation spikes sharply upward at a wavelength of 700 nm whereas the
spectrum of the road remains relatively unchanged. This corresponds to the chlorophyll
absorption band edge that occurs at a wavelength of about 700 nm. In Figure 2.8 note
how a “block” of high covariances rapidly transitions to a “block” of low covariances at
band 55. This feature is an indicator of the fact that there are significant differences in
the spectral shapes of the observed pixel vectors which start at this wavelength. This
can be interpreted to mean that the scene consists of both vegetation and non-vegetation
pixel vectors. If the pixel vectors did not possess significantly different shapes, then this
feature would not have manifested itself. The Davis Monthan scene is composed
predominantly of a sandy background, and as a result, the area between bands one and
100 appears to have high covariances and correlation coefficients without the sharp
transition at band 55. The blocky appearance in the first 100 bands, evident when
vegetation was present, is now absent.

While  these  observations  are  cursory,  they  demonstrate  how  the  statistics  of  the 
scene  reveal  a  great  deal  of  useful  information.  A  more  refined  study  of  scene  statistics, 
such as that pursued by Brower et al. (1996), finds that the scene statistics can be used
to differentiate urban and rural areas. This idea can be carried further to the problem of
differentiating  small  man-made  objects  in  a  natural  background  but  is  beyond  the  scope 
of  this  study.  Considered  independently,  the  scene  statistics  are  interesting  in  that  they 
provide  further  perspective  and  understanding  into  the  nature  of  the  scene.  More 
importantly,  they  bring  us  closer  to  the  invariant  display  problem  by  setting  the  stage  for 
an  understanding  of  the  techniques  which  use  statistics  to  describe  the  background. 

3.  Related  Signal  Processing  and  Linear  Algebra  Concepts 

a.  Linear  Transformations  of  Random  Variables 

The  fundamental  basis  of  the  hyperspectral  image  analysis  technique 
utilized  by  this  study  is  that  of  linear  transformations.  Our  statistical  definitions  of  the 
data  using  the  covariance  matrix  and  its  standardized  form,  the  correlation  matrix,  are 
central  to  an  invariant  display  strategy.  Understanding  the  effect  of  a  linear 
transformation  on  these  matrices  is  also  important  and  will  be  addressed. 

A  linear  transformation  of  a  vector  x  into  a  vector  y  is  accomplished  by 
the  matrix  A  in  the  relation  y  =  Ax.  Figure  2.10  illustrates  this  concept  using  two- 
dimensional  vectors. 


Figure 2.10: Linear Transformation of a Two-dimensional Vector.

The  transformation  matrix  A  rotates  and  scales  the  vector  x  into  the  new 
vector  y.  Since  second  order  moment  matrices  of  random  vectors  are  symmetric,  we  may 
assume  that  A  is  symmetric.  The  expectation  operator  is  linear,  which  implies  that  the 
mean  of  the  random  vector  x  is  transformed  as: 

$$E\{y\} = E\{Ax\} = A\,E\{x\} \qquad (2.7)$$

which can be restated as $m_y = A m_x$, where the subscript on the mean vector denotes
which random vector the mean vector represents. Similarly, using the definition of the
second order moment, the covariance matrix is transformed by the matrix A so that
(Therrien, 1992, p. 45)

$$\Sigma_y = A\,\Sigma_x A^T \qquad (2.8)$$
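Equations 2.7 and 2.8 can be verified numerically on sample statistics. In this sketch the data `x`, the mixing matrix, and the transformation matrix `a` are arbitrary values chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical correlated three-component random vectors with a nonzero mean.
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.7]])
x = rng.normal(size=(2000, 3)) @ mixing + np.array([5.0, -1.0, 2.0])

# An arbitrary 2 x 3 transformation matrix A (an assumption for this example).
a = np.array([[1.0, 2.0,  0.0],
              [0.0, 1.0, -1.0]])
y = x @ a.T                        # each row is y_j = A x_j

m_x, m_y = x.mean(axis=0), y.mean(axis=0)
sigma_x = np.cov(x, rowvar=False)
sigma_y = np.cov(y, rowvar=False)

mean_matches = np.allclose(m_y, a @ m_x)                # Equation 2.7
cov_matches = np.allclose(sigma_y, a @ sigma_x @ a.T)   # Equation 2.8
```

Both relations hold exactly for sample means and sample covariances, not only in expectation, so the transformed statistics can be computed from the original ones without transforming every pixel.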

A particularly useful transformation is one which transforms a random
vector x into another random vector y whose kth and lth components have the property of
statistical orthogonality such that (Therrien, 1992, p. 50):

$$E\{y_k y_l\} = 0, \quad k \neq l. \qquad (2.9)$$

The statistically orthogonal or uncorrelated random variables which result from such a
transformation cause the transformed data covariance matrix to be diagonal. The means
of achieving such a transformation that diagonalizes the covariance matrix is provided by
the concept of eigenvectors and eigenvalues.

b.  Eigenvectors  and  Eigenvalues 

The eigenvalues of an L x L matrix A are the scalar roots of its
characteristic equation, and are denoted as $\{\lambda_1, \ldots, \lambda_L\}$. The nonzero vectors, $\{e_1, \ldots, e_L\}$,
which satisfy the equation

$$A e_k = \lambda_k e_k \qquad (2.10)$$

are called the eigenvectors of A. An eigenvector defines a one-dimensional subspace that
is invariant with respect to premultiplication by A (Golub and Van Loan, 1983, p. 190).
In applying the above definitions of the eigenvalue and eigenvector to the L x L
covariance matrix, we obtain

$$\Sigma_x e_k = \lambda_k e_k. \qquad (2.11)$$

The covariance matrix in this relation may be viewed as a linear transformation which
maps the eigenvector $e_k$ into a scaled version of itself (Therrien, 1992, p. 50). Because
the real covariance matrix is symmetric and positive semidefinite, the L eigenvalues are
guaranteed to be non-negative and real (Searle, 1982). It is also possible to find L
orthonormal eigenvectors $\{e_1, \ldots, e_L\}$ that correspond to the L eigenvalues (Therrien, 1992,
p. 50) and that satisfy

$$e_k^T e_l = \delta_{kl}. \qquad (2.12)$$

c. Unitary Transformations

Suppose that the eigenvectors of the L x L covariance matrix $\Sigma_x$ are
packed into a matrix E as column vectors. Then, because of the orthonormality of the
eigenvectors, the matrix E transforms the covariance matrix in the following manner:

$$E^T \Sigma_x E = \begin{bmatrix} \leftarrow e_1^T \rightarrow \\ \vdots \\ \leftarrow e_L^T \rightarrow \end{bmatrix} \Sigma_x \begin{bmatrix} \uparrow & & \uparrow \\ e_1 & \cdots & e_L \\ \downarrow & & \downarrow \end{bmatrix} = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_L \end{bmatrix} = \Lambda \qquad (2.13)$$

following the rules of linear transformations (Therrien, 1992, p. 45). The transformation
matrix $E^T$ defines a linear transformation of a random vector x into a random vector y, by
the relation

$$y = E^T x \qquad (2.14)$$

in which the covariance matrix of y is a diagonal matrix represented by $\Lambda$. This
diagonalization of the covariance matrix $\Sigma_x$ is another manner of stating that the
components of random vector y are now uncorrelated since all off-diagonal elements of
$\Lambda$ are zero. The orthonormal columns of E imply that the transformation matrix $E^T$
represents a unitary transformation defined by (Therrien, 1992, p. 51)

$$E^T E = E E^T = I. \qquad (2.15)$$
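The diagonalization in Equations 2.13 through 2.15 can be illustrated with a symmetric eigensolver. This is a sketch on synthetic data; the array `x` and the mixing matrix are assumptions, and `numpy.linalg.eigh` is used because it is intended for symmetric matrices and returns orthonormal eigenvectors as columns.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical correlated four-band data (rows are pixel vectors).
x = rng.normal(size=(1000, 4)) @ rng.normal(size=(4, 4))
sigma_x = np.cov(x, rowvar=False)

# eigh returns eigenvalues in ascending order; reorder to largest first.
eigvals, e = np.linalg.eigh(sigma_x)
order = np.argsort(eigvals)[::-1]
eigvals, e = eigvals[order], e[:, order]

lam = e.T @ sigma_x @ e            # Equation 2.13: should equal diag(eigvals)
y = x @ e                          # each row is y = E^T x (Equation 2.14)
sigma_y = np.cov(y, rowvar=False)  # covariance of the transformed vectors
identity = e.T @ e                 # Equation 2.15
```

The covariance of the transformed data, `sigma_y`, is diagonal to within floating point error, which is the numerical counterpart of the statement that the components of y are uncorrelated.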

d.  A  Geometric  Interpretation  of  the  Unitary  Transform 

If we assume that our data has a Gaussian distribution, then we can
describe its probability density function (pdf) with a family of ellipsoids as:

$$(x - m_x)^T \Sigma_x^{-1} (x - m_x) = \text{constant} \qquad (2.16)$$

Because the matrix E is orthonormal, the implication is that the eigenvectors of $\Sigma_x$ are the
same as those of its inverse, and the eigenvalues of $\Sigma_x^{-1}$ are simply the reciprocals of
those of $\Sigma_x$ (Jolliffe, 1986, p. 14). Thus, the inverse transformation may be written as

$$x = E y \qquad (2.17)$$

and the equation defining the contours of constant density may be rewritten as:

$$(x - m_x)^T E \Lambda^{-1} E^T (x - m_x) = (y - m_y)^T \Lambda^{-1} (y - m_y) = \sum_{k=1}^{L} \frac{(y_k - m_{y_k})^2}{\lambda_k} = \text{constant} = C \qquad (2.18)$$

which is the equation for an ellipse with the principal axes of the ellipse being aligned
with the eigenvectors and the magnitudes proportional to $\lambda_k^{1/2}$ (Jolliffe, 1986, p. 19). This
geometrically illustrates the role that eigenvalues and eigenvectors play in the unitary
transform. Figure 2.11 shows that the unitary transformation is equivalent to a rotation of
the coordinate axes. The tilt of the ellipse with respect to the original coordinate system is
indicative of the fact that correlation exists between the original vector components
(Therrien, 1992, p. 59). In the new coordinate system defined by the unitary
transformation, the axes of the ellipse are parallel to the new axes, showing that the
vector components are indeed uncorrelated in this coordinate system. Although the
assumption was made that the data was Gaussian, this concept of two-dimensional
ellipsoids is a useful one in understanding the workings of the transformation even for
non-Gaussian data. In this context, the scatter plots of the Landsat data are useful in
portraying a rough idea of the distribution of the probability density function of the
random vectors.
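The equivalence of the two quadratic forms in the contour equation can be checked for a small example. The covariance matrix, mean vector, and test point below are made-up two-dimensional values chosen only for illustration.

```python
import numpy as np

# Hypothetical 2 x 2 covariance matrix with positive correlation, and a mean vector.
sigma_x = np.array([[4.0, 1.5],
                    [1.5, 1.0]])
m_x = np.array([10.0, 5.0])

eigvals, e = np.linalg.eigh(sigma_x)   # orthonormal eigenvectors as columns of e

x = np.array([12.0, 5.5])              # an arbitrary point on some contour
y = e.T @ x                            # rotate the point into the eigenvector axes
m_y = e.T @ m_x                        # rotate the mean the same way

# Quadratic form in the original coordinates (Equation 2.16).
d_x = (x - m_x) @ np.linalg.inv(sigma_x) @ (x - m_x)
# Sum of squared, eigenvalue-scaled components in the rotated coordinates.
d_y = np.sum((y - m_y) ** 2 / eigvals)

# Semi-axis lengths of the contour through x: proportional to sqrt(eigenvalue).
semi_axes = np.sqrt(d_x * eigvals)
# A point one semi-axis away from the mean along the first eigenvector.
axis_point = m_x + semi_axes[0] * e[:, 0]
d_axis = (axis_point - m_x) @ np.linalg.inv(sigma_x) @ (axis_point - m_x)
```

The point `axis_point` lies on the same contour as `x`, which shows that the ellipse axes are aligned with the eigenvectors.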


[Figure labels: region of scatter of pixel vectors (highly correlated in x space);
y space axes (in which data is not correlated); x space axes]

Figure 2.11: The Unitary Transformation as a Rotation of Axes. From Richards, 1993.


4.  Principal  Component  Analysis 
a.  Description 

Principal  components  analysis  (PCA)  as  applied  in  multispectral  and 
hyperspectral  remote  sensing  is  an  analytical  technique  based  on  the  linear  transformation 
of  the  observed  spectral  axes  to  a  new  coordinate  system  in  which  spectral  variability  is 
maximized.  The  impetus  for  such  a  transformation  is  the  high  correlation  that  exists 
between  adjacent  bands  in  spectral  imagery.  The  spectral  overlap  of  the  sensors  and  the 
wide  frequency  range  of  the  energy  reflected  from  the  ground  account  for  this  high 
correlation  (Rao  and  Bhargava,  1996,  p.  385).  This  implies  that  a  great  deal  of  spectral 
redundancy  exists  in  the  data.  The  principal  components  transformation  decorrelates  the 
information  in  the  original  bands  and  allows  the  significant  information  content  of  the 
scene  to  be  represented  by  a  smaller  number  of  linear  combinations  of  the  original  bands 
called principal components. The transformation effected by the PCA is a unitary
transformation and is graphically depicted in Figure 2.12 as operating on observed pixel
vectors to produce new pixel vectors with uncorrelated components.

The immediate applications of the principal components transformation
for this study are data compression and information extraction. In the problem of target
detection and development of an invariant display strategy, the latter is of considerable
interest. PCA techniques are based exclusively on the statistics of the observed variables,
requiring no a priori deterministic information about the variables in the image. This will
allow for a methodology whereby no preprocessing need be performed prior to displaying
the data utilizing “global” a priori knowledge.

Figure 2.12: PC Transformation Depicted as a Linear Transformation.


b.  Background 

Principal  components  analysis  is  an  extremely  versatile  tool  in  the  analysis 
of  multidimensional  data.  In  tracing  the  historical  roots  of  this  technique,  it  is  clear  that  it 
is  based  upon  ideas  drawn  from  the  fields  of  statistics  and  linear  algebra.  The 
mathematical underpinnings of PCA deal with the diagonalization of the covariance
matrix via eigendecomposition of the data by unitary transform and serve as a bridge
between matrix algebra and stochastic processes (Haykin, 1996). The wide applicability
of PCA is due to the fact that it assumes a stochastic outlook of the data, which is
fundamental to the analysis of data in many scientific disciplines. We will investigate the
views of two disciplines which employ PCA to better understand some of the mechanics
of this seemingly simple transformation. The two views are those of multivariate data
analysis and signal processing. A thorough understanding of the ideas that motivate the
PCA will assist in understanding why it is such a commonly used technique in remotely
sensed imagery analysis, and why this strategy is most appropriately applied to an
invariant display strategy.

(1)  Multivariate  Data  Analysis  View.  PCA  was  described  by 
Pearson  in  1901  and  introduced  as  the  Hotelling  transform  in  1933  by  Hotelling  for 
application  in  educational  psychology  (Singh  and  Harrison,  1985,  p.  884).  Hotelling’s 
goal  was  to  find  a  fundamental  set  of  independent  variables  of  smaller  dimensionality 
than  the  observations  that  could  be  used  to  determine  the  underlying  nature  of  the 
observed  variables  (Hotelling,  1933,  P.  417).  In  many  scientific  experiments,  the  large 
number  of  variables  makes  the  problem  of  determining  the  relative  importance  of  specific 
variables  intractable.  Hotelling’s  method  makes  the  problem  manageable  by  discarding 
the  linear  combinations  of  variables  with  small  variances,  and  studying  only  those  linear 
combinations  with  large  variances.  Since  the  important  information  in  the  data  is  usually 
contained  in  the  deviation  of  the  variables  from  a  mean  value,  it  is  logical  to  seek  a 
transform  which  provides  a  convenient  means  of  identifying  the  combinations  of 
variables most responsible for the variances (Anderson, 1984, p. 451). Original
variables which behave sufficiently similarly are combined linearly
into new variables called principal components. In this context, principal components
analysis studies the covariance relationships within a data set by investigating the number
of independent variables, and identifies the natural associations of the variables.

Mathematically represented, each principal component is a scalar
formed by a linear combination of the elements of the observed random vector x, where
each vector component corresponds to a random variable. The principal components are
constructed in such a manner as to be uncorrelated with all other principal components
and ordered so that variance is maximized (Jolliffe, 1986, p. 2). The kth principal
component is obtained by multiplying the transposed kth eigenvector of $\Sigma_x$ by the data
vector x, as depicted in the equation

$$y_k = e_k^T x. \qquad (2.19)$$

The kth principal component is also called a score, and the
components of the eigenvector are called loadings because they determine the
contribution of each original variable to the principal component. Generalizing the scalar
result of Equation 2.19 to a vector result:

$$y = E^T x \qquad (2.20)$$

we obtain a vector of L principal components when we take the product of all of the
transposed eigenvectors of $\Sigma_x$ and the data vector, x.
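Applied to an image, Equation 2.20 is evaluated once per pixel vector. The sketch below assumes the image is held as a NumPy array of shape (rows, cols, bands); the function name `principal_components` and the synthetic `demo_cube` are hypothetical and are not the processing chain used in the study.

```python
import numpy as np

def principal_components(cube):
    """Transform an image cube of shape (rows, cols, bands) into PC images.

    Returns the PC image cube, the eigenvalues (largest first), and the
    matrix E whose columns are the corresponding eigenvectors (loadings).
    """
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(float)   # one pixel vector per row
    sigma_x = np.cov(pixels, rowvar=False)           # band covariance matrix
    eigvals, e = np.linalg.eigh(sigma_x)
    order = np.argsort(eigvals)[::-1]                # sort by decreasing variance
    eigvals, e = eigvals[order], e[:, order]
    scores = pixels @ e                              # y = E^T x for every pixel
    return scores.reshape(rows, cols, bands), eigvals, e

# Synthetic five-band scene: one shared brightness pattern plus small noise.
rng = np.random.default_rng(4)
brightness = rng.uniform(10.0, 200.0, size=(32, 32, 1))
gains = np.array([1.0, 0.9, 1.1, 0.5, 0.7])
demo_cube = brightness * gains + rng.normal(scale=2.0, size=(32, 32, 5))

pc_images, eigvals, e = principal_components(demo_cube)
```

Each slice `pc_images[:, :, k]` is the image of scores for the (k+1)th principal component, and column `e[:, k]` holds the loadings that weight the original bands.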

While  the  property  of  the  unitary  transform  to  produce  new 
uncorrelated  variables  has  been  previously  discussed,  the  property  of  the  unitary 
transform  to  maximize  the  variance,  which  is  central  to  the  PCA,  will  be  discussed 
further.  The  best  illustration  of  this  property  is  the  algebraic  derivation  of  the  PCA.  The 
goal is to maximize the variance of the first principal component, denoted as VAR[$y_1$] or
VAR[$e_1^T x$]. By the definition of variance as a second order moment, this is equivalent
to maximizing $e_1^T \Sigma_x e_1$, where the eigenvectors are orthonormal, so that $e_1^T e_1 = 1$. The
method of Lagrange multipliers is employed so that the expression to be maximized is
differentiated with respect to the eigenvector and set equal to zero as

$$\frac{\partial}{\partial e_1}\left[ e_1^T \Sigma_x e_1 - \lambda\,(e_1^T e_1 - 1) \right] = 0 \;\Rightarrow\; (\Sigma_x - \lambda I)\, e_1 = 0. \qquad (2.21)$$

In Equation 2.21, $\lambda$ is a Lagrangian multiplier in the left hand
expression and corresponds to the largest eigenvalue of $\Sigma_x$ in the right hand expression,
and $e_1$ is the eigenvector corresponding to the largest eigenvalue (Jolliffe, 1986, p. 4).
Thus, the eigenvalues of $\Sigma_x$ represent the variances of the principal components, and are
ordered from largest to smallest magnitude. If the original variables have significant
linear intercorrelations, as spectral imagery does, then the first few principal components
account for a large part of the total variance (Singh and Harrison, 1985, p. 883). Figure
2.13 depicts the eigenvalues and associated variance percentage for a typical Davis-
Monthan scene.
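The share of total variance carried by each principal component follows directly from the eigenvalues. As a small stand-in for the hyperspectral case, the sketch below uses the six-band Landsat covariance matrix of Table 2.1 rather than the Davis-Monthan data, which is not reproduced in this text; the variable names are illustrative.

```python
import numpy as np

# Covariance matrix for the Canon City TM data, copied from Table 2.1.
cov = np.array([
    [45.391430, 54.157121, 62.472157,  48.824654, 48.318837, 43.826169],
    [54.157121, 69.760492, 79.432844,  64.610172, 64.421129, 57.616349],
    [62.472157, 79.432844, 96.376932,  77.863810, 79.247686, 70.393457],
    [48.824654, 64.610172, 77.863810, 100.500970, 74.711596, 57.981971],
    [48.318837, 64.421129, 79.247686,  74.711596, 87.056432, 70.991722],
    [43.826169, 57.616349, 70.393457,  57.981971, 70.991722, 63.739045],
])

# eigvalsh returns ascending eigenvalues of a symmetric matrix; reverse them.
eigvals = np.linalg.eigvalsh(cov)[::-1]
fraction = eigvals / eigvals.sum()     # share of total variance per PC
cumulative = np.cumsum(fraction)       # running total over the leading PCs

for k, (lam, frac, cum) in enumerate(zip(eigvals, fraction, cumulative), start=1):
    print(f"PC{k}: eigenvalue={lam:10.4f}  fraction={frac:6.2%}  cumulative={cum:6.2%}")
```

Because the six bands are highly intercorrelated, the first component accounts for most of the total variance, which is the behavior an eigenvalue plot makes visible.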


[Figure axis: Principal component number]

Figure 2.13: Eigenvalue Plot of Davis-Monthan. Note the large fraction of the overall
variance within first few PCs.

(2) Signal Processing View. In the analysis of random signals,
the key is to have a set of basis functions that make the components of the signal
statistically orthogonal or uncorrelated (Therrien, 1992, p. 173). The Karhunen-Loeve
Transform (KLT) was introduced in 1947 for the analysis of continuous random
processes,  and  is  developed  here  in  its  discrete  form,  the  DKLT.  It  is  the  same  unitary 
transform  previously  presented,  but  is  posed  to  solve  the  problem  from  a  different 
perspective.  The  motivation  for  the  DKLT  is  actually  an  expansion,  best  seen  by  Figure 
2.14,  which  shows  a  discrete  observed  signal  as  a  weighted  sum  of  basis  functions,  which 
are  in  fact  the  eigenvectors  of  the  covariance  matrix.  The  observed  pixel  vector  spectrum 
may  be  thought  of  as  a  discrete  signal,  indicated  by  the  square  brackets  in  the  notation  of 
Figure  2.14.  Whereas  in  the  PCA  approach  the  original  variables  are  weighted  by 
eigenvector  components  to  form  principal  components,  in  the  DKLT  the  eigenvector 
basis  functions,  {ei,...,eN},  are  weighted  by  the  principal  component  scores,  {yi,...,yN},  to 
form  a  representation  of  the  observation.  The  DKLT  has  an  optimal  representation 
property  in  that  it  is  the  most  efficient  representation  of  the  observed  random  process  if 
the  expansion  is  truncated  to  use  fewer  than  N  orthonormal  basis  functions.  This  makes  it 
very  attractive  from  a  compression  perspective,  and  explains  the  popularity  of  DKLT  as  a 
compression  scheme. 

Another  important  property  associated  with  the  DKLT  is  the 
equivalence  between  the  total  variance  in  the  vector  x  and  the  sum  of  the  associated 
eigenvalues.  This  property  is  mathematically  stated  by  the  equation 

$$\sum_{i=1}^{L} \sigma_i^2 = \sum_{i=1}^{L} \lambda_i \qquad (2.22)$$

Figure 2.14: The Karhunen-Loeve Expansion in Terms of Discrete Signals. After
Therrien, 1992, p. 175.

In Equation 2.22, $\sigma_i^2$ is the variance of the original variables and $\lambda_i$ is the eigenvalue
representing the variances of the transformed variables and the index i ranges over all L
bands. This property only holds for the orthonormal vectors which are eigenvectors of
$\Sigma_x$ and not for other orthonormal basis sets of vectors (Kapur, 1989, p. 501).

When  a  representation  of  a  signal  is  formed  by  using  fewer  than  L 
basis  functions,  the  mean  square  error  (MSE)  is  a  means  of  quantifying  how  well  the 
representation  corresponds  to  the  original  signal  by  measuring  the  power  of  the  difference 
between  the  representation  and  original  signals.  The  MSE  incurred  by  truncating  the 
representation  is  equal  to  the  sum  of  the  eigenvalues  of  the  covariance  matrix  that  were 
left out of the representation (Therrien, 1992, p. 179). Conversely, the largest
eigenvalues  and  their  corresponding  eigenvectors  can  be  used  to  represent  the  intrinsic 
dimensionality  of  the  signal.  This  corresponds  to  the  number  of  dimensions  that  would 
be needed to represent the signal to some predetermined MSE.

In signal processing applications, the DKLT is a means of
compressing  data  by  representing  it  with  a  truncated  number  of  eigenvectors.  It  is  also  an 
optimum  way  of  detecting  a  signal  in  noise  and  works  particularly  well  for  the  detection 
of  narrowband  signals.  Since  a  significant  portion  of  the  signal  energy  lies  in  the 
direction  of  the  first  few  eigenvectors,  those  eigenvectors  can  be  said  to  define  a  subspace 
for  the  signal  and  all  other  eigenvectors  define  the  subspace  for  the  noise.  This  simple 
example  is  the  basis  for  several  high  resolution  methods  of  spectral  estimation  used  to 
detect  sinusoids  in  noise  (Scharf,  1991,  p.  483). 
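
The truncation property described above (the MSE of a truncated representation equals the sum of the discarded eigenvalues) can be sketched in a few lines. This is a minimal illustration only: the function name `components_for_mse` and the eigenvalue spectrum are hypothetical and are not taken from the study.

```python
def components_for_mse(eigenvalues, max_mse):
    """Return the smallest number of leading eigenvectors whose truncation
    error (the sum of the discarded eigenvalues) does not exceed max_mse."""
    ordered = sorted(eigenvalues, reverse=True)
    discarded = sum(ordered)  # keeping zero components discards everything
    for kept, value in enumerate(ordered):
        if discarded <= max_mse:
            return kept
        # Keeping one more component removes its eigenvalue from the error.
        discarded -= value
    return len(ordered)


# Hypothetical eigenvalue spectrum (illustrative numbers only).
spectrum = [5.0, 3.0, 1.0, 0.5, 0.25, 0.25]
kept = components_for_mse(spectrum, max_mse=1.0)
```

Under this sketch, the value returned for a chosen MSE plays the role of the intrinsic dimensionality discussed above.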

c.  Operation 

PCA  uses  the  eigenvectors  of  Ex  to  assemble  a  unitary  transformation 
matrix  which,  when  applied  to  each  pixel  vector,  transforms  the  original  pixel  vector  into 
a  new  vector  with  uncorrelated  components  ordered  by  variance.  The  eigenvector 
components  act  as  weights  in  the  linear  combination  of  the  original  band  brightness 
values  that  form  the  principal  components  (Richards,  1993).  The  new  image  associated 
with  each  eigenvector  is  referred  to  as  the  principal  component  image.  The  principal 
component  images  are  ordered  from  largest  to  smallest  in  terms  of  variance,  and  are 
revealing  in  their  composition.  As  Singh  and  Harrison  (1985)  point  out,  it  must  be  kept 
in  mind  that  the  PCA  is  an  exploratory  technique  that  constructs  new  variables  called  the 
principal  components  (PCs).  These  new  variables  are  artificial  and  do  not  necessarily 
have a physical meaning, as they represent linear combinations of the observed variables
and cannot themselves be observed directly, but they should be related to physical quantities because the first
eigenvector in spectral imaging is a representation of the mean solar energy of the scene
and the next few eigenvectors deviate from it as the variance changes.

In  traditional  application  of  PCA,  the  hope  is  that  the  transformation  will 
enhance  the  contrast  of  the  image  by  grouping  like  areas  together  to  such  an  extent  that 
objects  or  areas  of  interest  can  be  more  readily  discriminated  in  the  principal  component 
images.  Jenson  and  Waltz  (1979)  give  an  analogy  which  clearly  explains  the  role  of  PCA 
in  the  traditional  application.  They  imagine  a  tube  filled  with  ping  pong  balls.  Looking 
at the tube directly from an end, only one ball is apparent; this is analogous to viewing the original
spectral image. Turning the tube sideways, all of the balls become visible (Jenson and
Waltz,  1979,  p.  341).  PCA  has  the  effect  of  decorrelating  the  data  so  that  independent 
sources  of  spectral  features  can  be  discerned. 
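
The operation described in this subsection can be illustrated for the simplest case of two bands, where the eigenvectors of the 2x2 covariance matrix have a closed form. This is a sketch under stated assumptions, not the implementation used in the study: the function names and the sample band values are hypothetical, and a real image cube would need a general eigen-solver.

```python
import math


def covariance_2band(band_a, band_b):
    """Unbiased covariance entries (var_a, cov_ab, var_b) of two bands."""
    n = len(band_a)
    mean_a = sum(band_a) / n
    mean_b = sum(band_b) / n
    var_a = sum((a - mean_a) ** 2 for a in band_a) / (n - 1)
    var_b = sum((b - mean_b) ** 2 for b in band_b) / (n - 1)
    cov_ab = sum((a - mean_a) * (b - mean_b)
                 for a, b in zip(band_a, band_b)) / (n - 1)
    return var_a, cov_ab, var_b


def pca_2band(band_a, band_b):
    """Rotate two correlated bands onto their principal axes."""
    var_a, cov_ab, var_b = covariance_2band(band_a, band_b)
    # Rotation angle that diagonalizes a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * cov_ab, var_a - var_b)
    c, s = math.cos(theta), math.sin(theta)
    # The eigenvector components (c, s) and (-s, c) act as band weights.
    pc1 = [c * a + s * b for a, b in zip(band_a, band_b)]
    pc2 = [-s * a + c * b for a, b in zip(band_a, band_b)]
    return pc1, pc2


# Hypothetical brightness values for five pixels in two correlated bands.
band_a = [10.0, 12.0, 15.0, 19.0, 24.0]
band_b = [11.0, 14.0, 15.0, 21.0, 23.0]
pc1, pc2 = pca_2band(band_a, band_b)
var_1, cov_12, var_2 = covariance_2band(pc1, pc2)
```

In this sketch the two output components are uncorrelated, the first carries the larger variance, and the total variance of the two bands is unchanged by the rotation.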

Though  PCA  assumes  no  a  priori  knowledge  of  the  scene,  PCA  as 
described  here  depends  intrinsically  on  the  scene  because  scene-specific  features  will 
dictate  the  shape  of  the  eigenvalues.  Nevertheless,  certain  general  observations  can  be 
made  regarding  the  PCA  and  an  associated  physical  meaning  without  any  knowledge  of 
the  scene.  The  following  two  figures  highlight  these  observations.  Figure  2.15  shows  the 
first  20  PC  images  of  the  Jasper  Ridge  AVIRIS  scene.  For  a  non-negative  symmetric 
matrix, the first eigenvector is all positive, so the first principal component is a weighted sum that loosely corresponds to the
average spectral radiance. All other eigenvectors must have at least one sign change to
be orthogonal to the first eigenvector. In forming the first
principal component image, the first eigenvector heavily weights the original bands
possessing the most variance. Thus, the first principal component image will have a
variance that is larger than that of any single original band image.

Figure 2.15: First 20 PC Images of Jasper Ridge Scene

The first PC image is the weighted sum of the overall response level in all original band images. The
second  principal  component  image  is  the  difference  between  certain  original  band 
images.  As  the  principal  component  image  number  increases,  the  PC  image  holds  less  of 
the  data  variance.  This  effect  manifests  itself  as  a  rough  decrease  in  image  quality  with 
increasing  PC  image  number.  In  Figure  2.15,  the  fact  that  the  first  twelve  PC  images 
contain  relatively  clear  details  of  the  scene  indicates  that  these  PC  images  together 
account  for  the  majority  of  the  overall  spectral  variance  in  the  scene.  It  is  interesting  to 
note  that  when  using  PCA,  the  higher  numbered  PC  images  sometimes  contain  a  large 
amount  of  local  detail.  Though  it  is  tempting  to  dismiss  the  higher  numbered  PC  images 
as  not  containing  any  useful  information  because  they  have  low  variance,  one  must  keep 
in  mind  that  the  covariance  matrix  on  which  PCA  is  based  is  a  global  measure  of  the 
variability  of  the  original  image  (Richards,  1986,  p.  138).  This  implies  that  small  areas  of 
local  detail  will  not  appear  until  higher  PC  images  since  they  do  not  make  a  statistically 
significant  impact  on  the  covariance  matrix.  Another  point  that  is  noteworthy  is  the  issue 
of  SNR.  PCA  orders  PC  images  based  on  total  variability.  It  does  not  differentiate 
between  the  variability  representing  desirable  information  and  the  variability  representing 
undesirable  noise  (Jenson  and  Waltz,  1979,  p.  338).  Ready  and  Wintz  (1973)  argue  that 
PCA  improves  the  SNR  of  the  spectral  image.  Their  definition  of  noise  is  additive  white 
Gaussian noise with a variance of $\sigma_n^2$. The SNR of the original image is

$$(SNR)_x = \frac{\sigma_{x,\max}^2}{\sigma_n^2} \qquad (2.23)$$

which is the maximum original band variance over the noise variance. The SNR of the
PC images is

$$(SNR)_y = \frac{\lambda_1}{\sigma_n^2} \qquad (2.24)$$

which is the largest eigenvalue (or new variance) over the noise variance. Since the first
eigenvalue is never smaller than the variance of any of the original bands, the improvement
in SNR is

$$\Delta SNR = \frac{(SNR)_y}{(SNR)_x} = \frac{\lambda_1}{\sigma_{x,\max}^2} \qquad (2.25)$$

and will be no less than one. The SNR improvement applies as long as the
eigenvalue exceeds the variance of the original bands. The diminishing SNR manifests itself
in  Figure  2.15  as  an  increased  fuzziness  of  the  image  that  begins  to  appear  around  the 
ninth  PC  image.  Figure  2.16  further  accentuates  the  above  observations  using  the  Cuprite 
radiance  and  reflectance  images.  The  first  ten  PC  images  are  shown  for  each  data  set. 
The  same  general  trends  noted  for  Figure  2.15  appear.  The  first  few  PC  images  offer  the 
greatest  amount  of  contrast.  The  effects  of  noise  become  apparent  sooner  in  decreased 
image  quality  with  the  reflectance  data  than  the  radiance  data. 
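
The SNR argument of Equations 2.23 through 2.25 can be checked numerically on a small example. The sketch below uses power iteration to find the largest eigenvalue; the 3-band covariance matrix, the noise variance, and the function name `largest_eigenpair` are all hypothetical values chosen for illustration, and power iteration is simply one convenient method rather than the one used in the study.

```python
import math


def largest_eigenpair(matrix, iterations=200):
    """Power iteration for the dominant eigenvalue and eigenvector of a
    symmetric matrix with non-negative entries."""
    size = len(matrix)
    vector = [1.0 / math.sqrt(size)] * size
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(size))
                   for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
        # Rayleigh quotient of the normalized vector.
        applied = [sum(row[j] * vector[j] for j in range(size))
                   for row in matrix]
        eigenvalue = sum(v * a for v, a in zip(vector, applied))
    return eigenvalue, vector


# Hypothetical 3-band covariance matrix (illustrative numbers only).
covariance = [[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]]
noise_variance = 0.5  # assumed white-noise variance

lambda_1, e_1 = largest_eigenpair(covariance)
max_band_variance = max(covariance[i][i] for i in range(len(covariance)))
snr_x = max_band_variance / noise_variance   # Equation 2.23
snr_y = lambda_1 / noise_variance            # Equation 2.24
delta_snr = snr_y / snr_x                    # Equation 2.25
```

For this matrix the first eigenvector has all positive components and the ratio `delta_snr` exceeds one, in line with the general observations made above.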

A  traditional  means  of  presenting  PCA  images  is  to  form  a  false  color 
composite image consisting of the first, second, and third PC images as the red, green,
and blue colors. Figure 2.17 presents such false color images for the Jasper Ridge and
Cuprite  radiance  PC  images.  This  mode  of  presentation  captures  the  major  sources  of 
spectral  variability  in  one  image.  The  levels  of  detail  and  contrast  apparent  in  the 
composite image are interesting to compare with the original image shown in Figure 2.3.

Figure 2.16: First 10 PC Images of Cuprite Radiance and Reflectance Scenes.
(Decreasing image quality as variance decreases for both radiance and
reflectance. Noise effects more apparent earlier in reflectance.)


Figure 2.17: False Color Images of Jasper Ridge and Cuprite Radiance PC Scenes.
(Panels: Jasper Ridge, Cuprite.)


A facet of PCA rarely mentioned in the pertinent literature is the
characterization  of  the  original  and  PC  images  using  the  behavior  of  the  eigenvalues  and 
eigenvectors.  The  behavior  of  the  eigenvalues  and  eigenvectors  will  be  investigated 
more  fully  later  in  the  study  because  these  attributes  form  an  important  part  of  analyzing 
the  scene  information  content.  In  spectral  images,  the  typical  trend  in  the  eigenvalue 
magnitude  is  that  a  very  small  number  of  eigenvalues  have  a  disproportionately  large 
magnitude  compared  to  the  others.  The  obvious  reason  for  this  distinct  grouping  of 
eigenvalues  is  that  the  data  in  the  original  image  exhibits  a  high  degree  of  interband 
correlation and the magnitude of the eigenvalues reflects the degree of redundancy in the
data (Richards, 1986, p. 137). Phrased another way, the intrinsic dimensionality, which
is  represented  by  the  number  of  large  eigenvalues  of  the  data,  is  much  smaller  than  the 
original  number  of  dimensions.  This  is  good  from  a  compression  view,  since  the  image 
variance will be accounted for by a very small number of principal components. From a
strict analysis viewpoint, it does not reveal as much information. If the problem were
that  of  a  narrow-band  signal  embedded  in  noise,  then  the  large  eigenvalues  would  be 
associated  with  the  signal.  In  the  hyperspectral  imagery  analysis  problem,  the  spectrum 
associated  with  a  target  is  not  narrowband,  and  hence  is  not  clearly  delineated  from  the 
eigenvalues  of  the  background  and  other  interfering  signatures.  The  eigenvalues  can  be 
divided into a primary and a secondary set, where the secondary set roughly corresponds
to  the  effects  of  instrumentation  noise  (Smith,  Johnson,  and  Adams,  1985,  p.  C798).  The 
primary  set  corresponds  to  the  linear  combinations  of  original  bands  that  cause  the  most 
variance  in  the  scene.  Figure  2.18  illustrates  the  first  ten  eigenvalues  of  the  Jasper  Ridge 
and Cuprite radiance images together. The y-axis of this plot is normalized and
represents  the  variance  of  each  PC  image.  The  Jasper  Ridge  PC  images  exhibit  slightly 
higher  variances  (eigenvalues)  than  the  Cuprite  scene.  The  quality  of  the  first  eight  PC 
images  noted  in  Figure  2.15  corresponds  to  the  steeper  initial  slope  of  the  detailed 
eigenvalue  plot. 


Figure 2.18: Eigenvalue Behavior of the Jasper Ridge and Cuprite Radiance Scene
Covariance Matrices. (Plot title: Jasper Ridge and Cuprite PC Normalized Eigenvalues, Radiance.)

Likewise, the first five images of the Cuprite radiance PC images in Figure 2.16 are
reflected in the steeper slope of the first five eigenvalues of Figure 2.18.

Figure  2.19  shows  the  eigenvalues  of  the  Cuprite  reflectance  image 
compared  to  the  radiance.  The  sharp  drop  in  the  slope  of  the  eigenvalues  is  paralleled  by 
the drop in image quality noted in Figure 2.16 after the second PC image. In general, the
AVIRIS  reflectance  eigenvalues  are  lower  in  magnitude  than  those  of  the  radiance. 


Figure 2.19: Eigenvalue Behavior of the Cuprite Radiance and Reflectance Scene
Covariance Matrix. (Plot title: Cuprite Radiance and Reflectance PC Normalized Eigenvalues.)

The above results clearly indicate that the variance of the transformed
data is concentrated in the first few PC bands, indicating that the dimensionality is on
the order of 10 rather than 200-225. It is important to note that the total variance of the original data
is equal to that of the transformed data. This property shows that the PC transformation
merely redistributes the concentration of variance in the bands of a spectral image so that
the higher variances occur in the first PC bands (Stefanou, 1997).

The eigenvector behavior is less clear than that of the eigenvalues. The
eigenvectors  form  the  bases  of  the  principal  components  subspaces.  Physically,  the 
eigenvectors  correspond  to  the  principal  independent  sources  of  spectral  variation.  As 
such,  the  wavelengths  at  which  the  maxima  and  minima  of  the  eigenvectors  occur 
account  for  the  wavelengths  that  contribute  the  most  to  a  particular  independent  axis  of 
variation  (Smith,  Johnson,  and  Adams,  1985,  p.  C808).  A  signal  processing 
interpretation  of  the  eigenvectors  is  that  the  eigenvectors  act  as  band  pass  filters  that 
transform  an  input  observed  spectrum  into  a  new  spectrum  that  has  fewer  data  points 
(Johnson,  Smith,  and  Adams,  1985).  This  interpretation  is  analogous  to  the  optimum 
representation  property  of  the  DKLT.  It  can  be  further  shown  that  the  eigenvectors  of 
reflectance  data  will  tend  to  have  a  more  distributed  appearance.  The  effect  of  the  sun  on 
the  low  numbered  original  bands  can  be  mitigated  in  the  conversion  to  reflectance, 
although  this  is  not  always  desired,  especially  for  real-time  processing.  Further 
examination  of  eigenvector  behavior  emphasizes  the  correlation  between  the  eigenvectors 
and  variance  occurring  in  the  original  image  bands  and  it  can  be  shown  that  the 
eigenvectors  of  the  PC  transform  tend  to  emphasize  those  original  bands  that  contain  the 
most  variance  with  larger  weights  and  inclusion  in  the  low  numbered  eigenvectors. 
(Stefanou,  1997) 

The  PCA  technique  has  been  examined  from  the  perspective  of  its 
results and the significance of its inner workings. Key points of PCA are:

•  In  general,  PCA  provides  an  analysis  of  the  data  which  guarantees  an  output  set  of 
images  ordered  by  variance. 

• Assuming white noise, it improves the SNR in the transformation from the
original image cube to the PC images.

• The PC images accentuate spectral regions of high variance. However, an area of
local detail may not be accentuated by a PC image due to its statistical
insignificance.

• Because the variability captured by the covariance matrix is scale-dependent, PCA is sensitive
to the scaling of the data to which it is applied, and as a result, the PCA of radiance
data  will  place  more  emphasis  on  the  visible  bands  due  to  the  sun  than  the  PCA  of 
reflectance  data. 

•  PCA  does  not  differentiate  between  noise  and  signal  variances  because  it  operates 
strictly  on  the  variance  of  the  observed  data. 

As a practical note in the implementation of PCA, the computation of the
eigenvectors and eigenvalues of $\Sigma_x$ is an expensive operation. Specific methods from
computational linear algebra such as inverse iteration, QR factorization, and singular
value decomposition (SVD) are all applicable in their calculation (Watkins, 1991).

The previous discussion highlights three crucial issues in development of
an invariant display strategy utilizing principal components. The first issue is
that the first PC band is typically a representation of the scene average brightness and is
generally dominated by solar radiance, but it can be affected by major scene constituents
such as in Jasper Ridge. The second issue is that the PC transformation outputs a set of
images ordered by variance, with the 2nd, 3rd, ..., Nth PCs dependent on the specific contents
of the image. The third issue is that PCA is sensitive to the scaling of the data to which it is
applied. Because of the scaling sensitivity, the PCA of radiance data will place more
emphasis on the visible bands due to the sun than the PCA of reflectance data.

III. PHYSICAL VISION


A.  DESCRIPTION 

The  typical  person  can  detect  light  with  a  wavelength  in  the  range  of  about  400 
nanometers  (violet)  to  about  700  nanometers  (red).  Our  visual  system  perceives  this 
range  of  light  wave  frequencies  as  a  smoothly  varying  rainbow  of  colors.  This  is  called 
the  visible  spectrum.  Figure  3.1  illustrates  the  visible  spectrum  approximately  as  a 
typical  human  eye  experiences  it. 


Figure  3.1:  Human  Visual  Spectrum.  (Scott,  1997) 


The  human  eye  has  a  lens  and  iris  diaphragm  which  serve  similar  functions  to  the 
corresponding  features  of  a  camera.  Other  than  this,  the  eye  is  quite  different  from  a 
camera.  Whereas  a  camera  has  a  flat  image  plane  where  the  resolution  and  spectral 
response is reasonably constant across the entire plane, the eye does not. The human eye
also provides a motion sensor system with nearly 180 degrees horizontal coverage. The
eye’s peripheral vision system only supports low resolution imaging but offers an
excellent ability to detect movement through a wide range of illumination levels.
Peripheral vision also provides very little color information.

The retina is a thin layer of nerve cells which consists of light sensor cells called
rods  and  cones.  The  majority  of  the  eye's  inside  chamber  has  this  retina  layer,  accounting 
for  the  very  wide  angle  of  our  peripheral  vision.  Figure  3.2  is  an  illustration  of  the  cross 
section  of  the  human  eye. 


Figure  3.2:  Cross  Section  of  the  Human  Eye. 


The  rods  in  the  retina  are  long  and  slender  while  the  cones  are  generally  shorter 
and  thicker.  Other  than  the  physical  differences,  there  is  an  important  functional 
difference  in  that  the  rods  are  more  sensitive  to  light  than  the  cones.  Figure  3.3  depicts 
the  relative  sensitivity  of  rods  and  cones  as  a  function  of  illumination  wavelength. 


Figure  3.3:  Sensitivity  of  Rods  and  Cones.  (Pratt,  1991,  p.  25) 



It has been experimentally determined that there are three basic types of cones in
the retina (Wald, 1964). These cones have different absorption characteristics as a
function of wavelength with peak absorptions in the red (580 nm), green (540 nm) and
blue (450 nm) visible spectrum. Our perception of color is determined by which
combination of cones is excited and by how much. Figure 3.4 illustrates the spectral
sensitivity of the typical human visual system. The RGB sensors are denoted with the
Greek letters Rho (red), Gamma (green) and Beta (blue). Human vision has a great deal
of  sensitivity  to  low  ambient  illumination  situations.  In  low  ambient  illumination,  the 
cones  contribute  little  or  no  sensitivity  and  imaging  is  primarily  accomplished  by  the 
rods. 


Figure  3.4:  Spectral  Absorption  Curves.  (Scott,  1997) 


The  sensitivity  curves  of  the  Rho,  Gamma  and  Beta  sensors  in  our  eyes  determine 
the intensity of the colors we perceive for each of the wavelengths in the visual spectrum.
Figure  3.5  is  an  approximation  of  the  visual  spectrum  illustration  adjusted  for  the 
sensitivity curves of our Rho, Gamma and Beta sensors.

Figure 3.5: Spectral Absorption Curves. (Scott, 1997)


There  are  three  perceptual  definitions  of  light:  brightness,  hue,  and  saturation.  If 
we  observe  two  light  sources  with  the  same  general  spectral  shape,  the  source  with  the 
greater  intensity  will  generally  appear  to  be  perceptually  brighter.  Hue  is  the  attribute 
that  distinguishes  a  red  color  from  a  green  color  or  a  yellow  color.  Saturation  is  the 
attribute  that  distinguishes  a  spectral  light  from  a  pastel  light  within  the  same  hue. 

B.  COMPARISON  TO  PRINCIPAL  COMPONENTS 

Processing of color within the human eye is accomplished through an achromatic
channel and two opponent-color channels. The opponent-color channels are the
red-green opponent and the blue-yellow opponent channel (Wyszecki and Stiles, 1967;
Buchsbaum and Gottschalk, 1983). The A, R-G, and B-Y channels are formed from a
principal components analysis, are statistically uncorrelated, and therefore make up
orthogonal dimensions in a 3-D color space. Furthermore, it has been shown that there
are two fundamental axes within color space, comprised of a R-G plane where all colors
have an absence of yellowness or blueness and a B-Y plane where all colors have an
absence of redness or greenness. The intersection of these two planes is a line with
absence of all color, or the gray line, which corresponds to the achromatic channel
(Krauskopf, et al., 1982).

We  can  use  the  concept  of  three  orthogonal  axes  to  develop  the  hue,  saturation, 
and  value  (H-S-V)  color  representation  system.  From  the  previous  section,  hue  indicates 
a particular color, e.g. the perceived colors of red, green, blue, etc., saturation indicates
the purity of a particular hue, e.g. S=1 denotes a pure hue while S=0 denotes absence of
color  (gray),  and  value  is  related  to  the  brightness  or  intensity  of  a  particular  color.  The 
perceptual  color  space  therefore  makes  up  a  cone,  with  the  A  axis  as  the  axis  of  rotation 
of  the  cone  and  the  R-G  and  B-Y  axes  transverse.  Hue  is  determined  by  computing  an 
angle  in  the  (red-green)  -  (blue-yellow)  plane,  and  saturation  is  determined  by  the  angle 
between  a  particular  point  in  color  space  and  the  gray  axis.  Figure  3.6  graphically  depicts 
this  conical  representation  and  its  associated  radial  visualization. 


Figure  3.6:  Perceptual  Color  Space. 


When mapping color space we encounter four distinct problems. The first is that
color space is nonlinear. The non-linearity is related to the spectral response functions of
the individual photoreceptors. The second problem is that we typically map red next to
violet when they actually appear at opposite ends of the spectrum. This can be seen in
Figure 3.1. Thirdly, as intensity is increased, the hue is perceived to shift; this is the
issue of color constancy (Brainard, et al., 1993, p. 165-170). The fourth problem is
associated  with  the  color  planes.  It  has  been  demonstrated  that  while  red-green  and  blue- 
yellow  planes  exist,  other  similar  color-opponent  planes  do  not  seem  to  exist,  hence  the 
term “cardinal directions” used by Krauskopf, et al. This cardinal direction scheme will
be  utilized  for  the  invariant  mapping  strategy  in  later  sections. 

C.  PSEUDOCOLOR  AND  OPPONENT  COLOR  MAPPING  STRATEGIES 

In  the  past,  pseudocolor  displays  have  utilized  a  mapping  strategy  whereby  the 
principal  components  were  directly  depicted  by  mapping  the  PCs  as  follows: 

P1 -> Red               P1 -> Brightness
P2 -> Green      OR     P2 -> Hue
P3 -> Blue              P3 -> Saturation

Figure  3.7  is  a  mapping  of  the  first  three  PCs  into  (R,G,B)  of  a  scene  from  Davis- 
Monthan  AFB.  Although  this  method  does  depict  some  of  the  high  spatial  frequency 
information,  much  of  it  is  suppressed  and  there  is  an  apparent  smearing,  which  is  a  result 
of  how  the  observer  views  the  data.  These  methods  are  not  a  good  fit  to  human  vision. 

It has been shown that for humans, the achromatic spectral channel accounts for
approximately 97% of color vision, while the R-G and B-Y channels account for
approximately 2% and 1% respectively (Buchsbaum and Gottschalk, 1983). From the
previous discussion of the principal components transformation, it can be readily seen
that the first principal component of a spectral image is roughly achromatic in that it
samples the mean illumination distribution and can therefore be viewed as the intensity or
brightness  of  a  hyperspectral  image.  Continuing  along  this  reasoning,  we  note  that  the 
second  and  third  PCs  contain  significantly  less  scene  variance  and  subsequently  higher 
PCs  contain  even  lower  amounts.  From  this  we  may  be  able  to  conclude  that  we  can  map 
the  second  and  third  principal  components  into  the  (R/G)  -  (B/Y)  plane. 


Figure 3.7: Pseudocolor Representation of Davis-Monthan Scene Obtained by Mapping
the First Three PCs into (R,G,B). (Panel title: PC1 - Red, PC2 - Green, PC3 - Blue.)


Mapping  the  first  PC  into  the  achromatic  channel,  the  value  of  the  second  PC  into 
the  R-G  channel  and  the  value  of  the  third  PC  into  the  B-Y  channel  yields 


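$$\begin{aligned}
\phi = \operatorname{atan}\left(\frac{P_3}{P_2}\right) &\rightarrow H \ (\text{Hue}) \\
\frac{\sqrt{P_2^2 + P_3^2}}{P_1} &\rightarrow S \ (\text{Saturation}) \\
P_1 &\rightarrow V \ (\text{Value})
\end{aligned} \qquad (3.1)$$

where $P_i$ is the ith PC (Tyo, et al., 2000).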

The mapping of the same Davis-Monthan scene with Equation 3.1 yields Figure
3.8 and is a more visually pleasing representation because pixels that don't have a
significant projection onto either P2 or P3 appear desaturated or gray. This makes the
image easier to view because naturally occurring scenes tend to be largely desaturated
with low dimensionalities in the visible portion of the spectrum (Buchsbaum and
Gottschalk, 1983).
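
The per-pixel mapping of Equation 3.1 can be sketched as follows. Several choices here are assumptions made for illustration and are not stated in the study: `atan2` is used in place of `atan` so that all four quadrants of the (P2, P3) plane receive distinct hues, P1 is taken to be already scaled to [0, 1], saturation is clipped to 1, and the standard-library `colorsys` hue wheel is used as is. The function name `pcs_to_rgb` is hypothetical.

```python
import colorsys
import math


def pcs_to_rgb(p1, p2, p3):
    """Map the first three PC values of one pixel to display RGB.

    Assumes p1 is scaled to [0, 1] and p2, p3 are centered on zero;
    both scalings are illustrative choices.
    """
    # Hue: angle in the (P2, P3) plane, wrapped to [0, 1).
    phi = math.atan2(p3, p2)
    hue = (phi / (2.0 * math.pi)) % 1.0
    # Saturation: chromatic magnitude relative to the achromatic component.
    saturation = min(1.0, math.hypot(p2, p3) / p1) if p1 > 0 else 0.0
    # Value: the first PC, clipped to the displayable range.
    value = min(1.0, max(0.0, p1))
    return colorsys.hsv_to_rgb(hue, saturation, value)
```

A pixel with no projection onto P2 or P3, for example `pcs_to_rgb(0.5, 0.0, 0.0)`, comes out gray, which is the desaturation behavior described above.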


Figure 3.8: Pseudocolor Representation of Davis-Monthan Obtained with Equation 3.1.
(Panel title: HSV from Covariance, Color Reshaped.)

The  mapping  strategy  in  Figure  3.8  was  designed  with  the  performance  of  the 
human  visual  system  in  mind  and  does  not  present  images  that  contain  large  regions  of 
highly  saturated  hues  that  vary  rapidly.  To  obtain  Figure  3.8  we  first  needed  to  adjust  the 
hue representation within MATLAB. MATLAB's built-in HSV2RGB function maps the
colors starting with red, then runs through yellow, green, blue, and then back to red again
in a non-orthogonal manner. The non-orthogonal mapping yields incorrect hue
opponency, i.e. red is not opposite green. Figure 3.9a depicts MATLAB's default
non-orthogonal colorwheel and Figure 3.9b is the colorwheel reshaped to correspond to the
correct color orthogonality.
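
One way to reshape a hue wheel so that red sits opposite green and blue sits opposite yellow is a piecewise-linear remapping of the hue angle. The study does not state how its reshaping was done, so the sketch below is an assumption throughout: the anchor positions, the choice of yellow at a quarter turn, and the function name `reshape_hue` are all hypothetical.

```python
import bisect

# (opponent-wheel position, standard HSV hue) anchor pairs. The anchor
# placement is an assumption made for illustration.
_ANCHORS = [
    (0.00, 0.0),        # red
    (0.25, 1.0 / 6.0),  # yellow
    (0.50, 1.0 / 3.0),  # green, now opposite red
    (0.75, 2.0 / 3.0),  # blue, now opposite yellow
    (1.00, 1.0),        # back to red
]


def reshape_hue(position):
    """Map an opponent-wheel position in [0, 1) to the hue expected by a
    standard HSV-to-RGB conversion, interpolating between the anchors."""
    position %= 1.0
    xs = [x for x, _ in _ANCHORS]
    index = min(bisect.bisect_right(xs, position) - 1, len(_ANCHORS) - 2)
    x0, h0 = _ANCHORS[index]
    x1, h1 = _ANCHORS[index + 1]
    return h0 + (h1 - h0) * (position - x0) / (x1 - x0)
```

With this remapping, a half turn from red lands on the standard green hue (1/3) rather than on cyan (1/2), which is the opponency the reshaped wheel is meant to provide.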

Figure 3.9: Hue Wheels Using MATLAB Default and Reshaped Hue Values.
(Panels: a, default wheel; b, reshaped wheel.)

Although the remapping of MATLAB’s colorwheel yields the correct
orthogonality  between  the  colors,  it  does  not  produce  an  image  that  maps  the  materials 
within  the  scene  to  our  perception  of  that  material,  i.e.  vegetation  to  green,  water  to  blue, 
etc.  Utilizing  the  HYDICE  bands  of  the  original  data  corresponding  approximately  to  the 
peak sensitivities of the ρ, γ and β receptors, spectral bands 150, 38 and 10 respectively,
we can display a scene that depicts approximately how we would perceive that scene if
viewed directly. Figure 3.10 is a Red-Green-Blue image corresponding to bands 150, 38
and 10 that accurately portrays the golf course as green and the background sandy soil as
tan to slightly red.


Figure 3.10: RGB Image with Original Data Bands 150, 38 and 10.
(Panel title: True color, R-150, G-38, B-10, clipping.)

Utilizing  the  RGB  mapping  strategy  of  Figure  3.10  and  knowledge  of  linear 
transformations  and  eigenvectors  from  chapter  two,  we  can  identify  a  3x3  set  of  RGB 
eigenvectors  for  the  individual  scene.  These  statistics  can  then  be  applied  to  the  RGB 
transformation of Equation 3.1 to produce an image that preserves the hue of the primary
source of variance within the image (Figure 3.11). This mapping strategy retains the
image display advantage achieved with Equation 3.1 and also allows for a
straightforward method of mapping this scene into perceptual colorspace that preserves
the expected hues for major scene constituents.


Figure 3.11: HSV Image Transformed with Scene RGB Data.
(Panel title: Transformed and Color Rotated Image.)

Keeping  human  visual  perception  and  the  characteristics  of  PCs  in  mind,  it  is 
clear  that  if  a  general  set  of  PCs  can  be  identified,  a  color  mapping  strategy  can  be 
arranged  so  that  materials  are  presented  in  a  straightforward  manner,  i.e.  water  can 
always  be  mapped  to  blue,  etc.  As  a  wider  range  of  wavelengths  is  considered,  it  should 
be  expected  that  more  than  3  PCs  may  be  necessary  to  capture  an  equivalent  amount  of 
the  data  (99%  or  more).  The  next  sections  will  investigate  this  and  develop  a  coherent 
method for an invariant display methodology.

IV. DATA ANALYSIS


A.  CASE  STUDIES 

For this study, the Davis-Monthan HYDICE Collects of June and October 1995
were  utilized  for  analysis.  Figure  4.1  is  an  aerial  photograph  of  the  Davis-Monthan 
collect  area.  The  case  studies  are  subsets  of  Figure  4.1. 


B.  CASE  STUDY  ANALYSIS 


Analysis of the data sets was performed utilizing three statistical methods. The
first  method  involved  computing  the  unbiased  estimate  of  the  covariance  matrix  from 
Equation  2.4.  The  second  method  utilized  the  correlation  matrix  obtained  from  the 
covariance  coefficients  as  identified  in  Equation  2.5  and  the  third  method  used  the  direct 
correlation  as  found  from 


4.1 
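
As an illustration of the three statistical methods, the following is a minimal MATLAB sketch on a small synthetic data set. It is not the code used in this study. The pixel matrix X and its values are hypothetical, and the 1/N scaling of the direct correlation is an assumption; any positive scaling leaves the eigenvectors unchanged.

```matlab
% Sketch only: X holds one pixel vector per row (N pixels, d bands).
% The values are synthetic and chosen purely for illustration.
X = [10 20 31; 12 25 29; 15 27 35; 11 22 30; 14 28 33];
[N, d] = size(X);

m  = mean(X, 1);              % band means, 1-by-d
Xc = X - repmat(m, N, 1);     % mean-removed pixel vectors

C = (Xc' * Xc) / (N - 1);     % unbiased covariance estimate
s = sqrt(diag(C));            % band standard deviations
R = C ./ (s * s');            % correlation matrix from the covariance coefficients
D = (X' * X) / N;             % direct correlation, no mean removal (assumed 1/N scaling)
```

Under these assumptions the direct correlation differs from the covariance only by a rescaling and the outer product of the band means, D = ((N-1)/N)*C + m'*m, which is why removing the mean changes the magnitude of the matrix so strongly.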


As  noted  earlier,  computation  of  the  covariance  and  direct  correlation  matrices  is 
computationally  expensive.  Taking  advantage  of  the  symmetric  nature  of  the  statistics 
reduces  the  number  of  computations  required,  but  real-time  computing  of  these  values  is 
still  time  consuming  and  not  practical.  Table  4.1  highlights  this  and  depicts  the  number 
of  flops  required  for  the  two  prevalent  data  sets  from  the  Davis-Monthan  collect. 


Data in Megaflops                      Samples x Lines x Bands
                                       320x960x210    320x1280x210
Band Mean Computation                  64.57          86.0836
Covariance and Direct Correlation      47417.5        63223.984
Correlation Matrix from Coefficients   0.1764         0.1764
Image to PC Transform Average          27096          36128

Table 4.1: Statistics Computation.


The direct correlation statistics of the data sets were found from Equation 4.1 and
were also investigated to identify the effect of removing the mean from each band.
Figure 4.2 is a density slice representation of the three statistical matrices for two
Davis-Monthan scenes and the average of all sixteen scenes. Notice the similar structure and
intensity values for the same matrix type. The first set has a covariance magnitude on the
order of 10^6 and a direct correlation magnitude of 10^13, and the next set is on the order
of 10^6 and 10^12, respectively.

[Figure panels, left to right: Covariance, Correlation, Direct Correlation]

Figure 4.2: Density Slice Representation of Statistics Matrices. First Row - Scene One.
Second Row - Scene Two. Third Row - Average.

These results are very pleasing in that they graphically depict a structural
similarity within the statistics and correspond very well to results obtained by Brower
et al., 1996. This structural similarity will be capitalized on to develop an invariant
display strategy.

Of the sixteen data sets, all have the same basic characteristics as the two scenes
shown in Figure 4.2, and as depicted by the average, the structure remains nearly constant
across all sixteen scenes. Further investigation into the physical properties of the data
sets revealed that the mean values of all the collects were also structurally similar. Figure
4.3 is an example of a mean obtained from the Davis-Monthan collects. The mean value
corresponds to the average radiance for that particular band. This implies that if there is a
similar structure within the collection means, then there should be a similar structure for
the first eigenvector of all the collections, since the first eigenvector is a representation of
radiance (Figure 4.4). Furthermore, since the covariance and direct correlation matrices
are derived directly from the pixel vectors, we would expect the eigenvectors and
individual pixel vectors to display similar characteristic behaviors, i.e., the pixel vector
spectrum will follow the behavior of the eigenvector nulls (Figure 4.5).
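
The expected link between the band means and the first eigenvector can be checked numerically. The MATLAB sketch below uses a hypothetical synthetic scene in which every pixel is a scaled copy of one base spectrum plus a small perturbation; the data, the variable names, and the cosine similarity measure are illustrative assumptions, not part of the original analysis.

```matlab
% Sketch only: synthetic scene, 8 pixels by 5 bands.
base  = [4 6 9 7 3];                  % hypothetical base radiance spectrum
scale = (1:8)';                       % per-pixel brightness
pert  = 0.1 * sin((1:8)' * (1:5));    % small deterministic perturbation
X     = scale * base + pert;

D = (X' * X) / size(X, 1);            % direct correlation (assumed 1/N scaling)
D = (D + D') / 2;                     % enforce exact symmetry before eig

[V, L] = eig(D);
[~, k] = max(diag(L));                % index of the largest eigenvalue
v1 = V(:, k);                         % first eigenvector

m = mean(X, 1)';                      % band means as a column vector
% Cosine of the angle between the first eigenvector and the mean spectrum;
% abs() removes the arbitrary sign of the eigenvector.
cosang = abs(v1' * m) / (norm(v1) * norm(m));
```

A value of cosang close to one indicates that the first eigenvector of the direct correlation matrix has essentially the same shape as the mean spectrum.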

Notice  also  that  the  correlation  matrices  have  the  same  scaling.  It  is  clear  from 
the  correlation  matrices  that  there  is  a  certain  degree  of  symmetry  between  the  bands. 
But,  it  must  be  remembered  that  the  correlation  matrix  as  shown  here  is  the  statistical 
correlation  between  the  bands  and  therefore  the  eigenvector  behavior  will  not  follow  that 
of  the  scene  content,  but  that  of  the  interband  correlation.  This  is  clearly  seen  in  Figure 
4.5  where  the  pixel  spectra  behavior  does  not  correspond  to  that  of  the  first  eigenvector  of 
the  correlation  matrix. 

Figure 4.5: Pixel Spectra Comparison to the First Eigenvector.


Taking this further, we also expect that as the eigenvector number increases,
the eigenvector behavior will become more directly correlated to the specifics of the
particular scene we are studying, implying that the
individual scene eigenvectors will become more and more dissimilar (Figure 4.6). But if
we were to utilize an ‘average’ eigenvector, we would assume that although the lower
eigenvectors may no longer be as ‘close’ as before, the overall behavior would be
similar across all scenes that the average was taken from. When this comparison is made,
we see that the lower numbered eigenvectors remain similar and the higher numbered
eigenvectors become more dissimilar, as expected (Figures 4.7 and 4.8).
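
One simple way to quantify how similar two sets of eigenvectors are, index by index, is the absolute value of their inner product. The MATLAB sketch below is an assumption about how such a comparison could be made, not the method used to produce the figures. The two covariance matrices are hypothetical: they share their first two eigenvectors, while the last two are rotated by 45 degrees.

```matlab
% Sketch only: compare eigenvectors of two synthetic covariance matrices.
H = [1 1 1 1; 1 -1 1 -1; 1 1 -1 -1; 1 -1 -1 1] / 2;   % orthonormal basis
a = pi / 4;
G = H;
G(:, 3:4) = H(:, 3:4) * [cos(a) -sin(a); sin(a) cos(a)]; % rotate last two vectors

C = {H * diag([100 10 1 0.1]) * H', ...                 % 'scene' covariance
     G * diag([ 90 12 1.5 0.2]) * G'};                  % 'average' covariance

V = cell(1, 2);
for s = 1:2
    A = (C{s} + C{s}') / 2;               % enforce exact symmetry
    [E, L] = eig(A);
    [~, idx] = sort(diag(L), 'descend');  % order by decreasing variance
    V{s} = E(:, idx);
end

% Similarity per eigenvector number: 1 = same direction, 0 = orthogonal.
% abs() removes the arbitrary sign returned by eig.
sim = abs(sum(V{1} .* V{2}, 1));
```

In this constructed case the first two entries of sim equal one and the last two equal cos(45 degrees), mirroring the pattern in which low numbered eigenvectors agree and higher numbered ones diverge.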


[Figure panels: First, Second, Third, and Fourth Eigenvector Comparison]

Figure 4.6: Eigenvector Comparison for Two Scenes.

[Figure panels: First and Seventh Eigenvector Comparison]

Figure 4.7: Eigenvector Comparison Between the Average and One Scene.

[Figure panels: First, Second, Sixth, and Eighth Eigenvector Comparison]

Figure 4.8: Eigenvector Comparison Between the Average and Multiple Scenes.

From the randomly chosen scene comparisons above, it is clear that for the
particular data collections from Davis-Monthan, the average eigenvector up to number
six is an excellent approximation for all the scenes. It must also be noted that the first
three eigenvalues associated with the Davis-Monthan average account for 97.5 percent of
the total variance (Figure 4.9), and eigenvalues four and higher comprise only 2.5 percent
of the overall variance.


[Figure panel: Eigenvalue Plot from Covariance]

Figure 4.9: Eigenvector Comparison Between the Average and Multiple Scenes.


From analysis of the 16 data sets we can also conclude that the behavior of the
RGB eigenvector subset discussed in chapter three should also be similar. Comparison of
the average Davis-Monthan RGB eigenvectors with three random scene eigenvectors
shows that they are indeed similar (Figure 4.10).

[Figure panels: Davis-Monthan Average RGB Eigenvectors; RGB 2nd Eigenvector Comparison]

Figure 4.10: RGB Eigenvector Comparison Between the Average and Multiple Scenes.


The  next  question  to  be  investigated  is  whether  or  not  these  same  sets  of 
eigenvectors  can  be  applied  to  dissimilar  scenes  and  still  approximate  the  overall 
behavior  of  the  dissimilar  scene. 


C.  DISSIMILAR  SCENE  COMPARISONS 

For a comparative analysis of the eigenvectors obtained from the Davis-Monthan
scenes, a data set from Jasper Ridge and a set from Lake Tahoe will be utilized. Jasper
Ridge was chosen because the data was obtained from a different sensor (AVIRIS) of the
same class and the scene is dominated by vegetation. Lake Tahoe was chosen because it
is from a HYDICE collect which contains vegetation and a large section of water. Both
of these scenes contrast with Davis-Monthan in that they contain much more vegetation
and the background is not predominately sand.

The scene statistics of Jasper Ridge (Figure 4.11) and Lake Tahoe (Figure 4.12)
also follow the same general structure as Davis-Monthan. The significant differences of
note are that while the variance in the Davis-Monthan averages is highest between
bands 15 and 60, the variance is maximum between bands 40 and 60 for Jasper Ridge and
between bands 55 and 70 for Lake Tahoe. Also of note is the difference in the shape of the
mean (Figure 4.13). The difference in shape around 700 nm can be attributed to the
chlorophyll absorption spectrum that is characteristic of vegetation.


Figure 4.12: Scene Statistics for Lake Tahoe.

[Figure panel: Mean Comparison]

Figure 4.13: Jasper Ridge and Lake Tahoe Mean Compared to Davis-Monthan Average.

With the exception of the wavelengths between 450 nm and 750 nm, the behavior of the
means for Jasper Ridge and Lake Tahoe corresponds very closely to that of the
Davis-Monthan scene. From this we can conclude that the behavior of the first eigenvector will
also be similar in shape, with the exception of the chlorophyll absorption area, as shown
in Figure 4.14, even though the scene constituents are different. The comparison of
subsequent eigenvectors is not as simple as the first because, as noted earlier, as the
eigenvector number increases, the scene specifics begin to dictate the behavior of the
eigenvector. Figure 4.14 depicts this behavior. Eigenvectors number one and two from
all three sets are similar in shape; however, starting with eigenvector number three, the
shapes begin to diverge significantly from the Davis-Monthan average.

Figure 4.14: Jasper Ridge and Lake Tahoe Eigenvectors Compared to Davis-Monthan Average.

Although the first eigenvector is similar for all scenes, the Jasper Ridge and Lake Tahoe
eigenvectors quickly begin to diverge from the Davis-Monthan average by eigenvector
number three. Also, the Jasper Ridge and Lake Tahoe eigenvectors are similarly
structured from the first eigenvector up through the fourth eigenvector. (The
Davis-Monthan scenes' eigenvectors remained similarly structured up through eigenvector
number six (Figure 4.8).) This would seem to indicate that the statistics of Jasper Ridge
and Lake Tahoe are more similar to each other than to the Davis-Monthan statistics and that
these scenes are from a class that does not include Davis-Monthan.


A comparison of the Davis-Monthan average RGB eigenvectors to Lake Tahoe
and Jasper Ridge yields similar results (Figure 4.15). As expected, the first eigenvectors,
which correspond to the overall scene color composition, are nearly identical, while the
next two eigenvectors diverge due to variances within the individual scenes.

Figure 4.15: Jasper Ridge and Lake Tahoe RGB Eigenvectors Compared to Davis-Monthan Average.

D. APPLICATION OF AVERAGES


To test whether averages of the Davis-Monthan scenes could be applied to other
scenes, a subset of 13 data sets from the October 1995 collect was averaged to obtain the
PC transformation and RGB eigenvectors. These ‘global’ eigenvectors were then applied
to two different scenes to obtain the PCs, HSV and RGB transformed HSV images. The
first scene for comparison was also from Davis-Monthan, but collected during June 1995.
The second scene that the Davis-Monthan averages were applied to was the Lake Tahoe
data set. The Lake Tahoe data set was chosen for comparison because the data was also
collected with the HYDICE sensor, but is of a completely dissimilar scene background.
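
The procedure described above can be sketched in MATLAB as follows. This is a minimal illustration on hypothetical synthetic data, not the code used for the study. It assumes that the ‘global’ eigenvectors come from the eigendecomposition of the averaged scene covariance matrices, and that the new scene's own band means are removed before projection; the source does not state which mean was used.

```matlab
% Sketch only: each synthetic 'scene' is an N-by-d matrix of pixel vectors.
t = (1:6)';
scenes = {[t, 2*t + sin(t), 0.5*t + cos(t)], ...
          [t, 2*t + cos(t), 0.5*t - sin(t)]};
d = 3;

% Average the scene covariance matrices.
Cavg = zeros(d);
for k = 1:numel(scenes)
    Cavg = Cavg + cov(scenes{k});
end
Cavg = Cavg / numel(scenes);
Cavg = (Cavg + Cavg') / 2;               % enforce exact symmetry

% 'Global' eigenvectors, ordered by decreasing eigenvalue.
[E, L] = eig(Cavg);
[lambda, idx] = sort(diag(L), 'descend');
Eg = E(:, idx);

% Apply the global eigenvectors to a scene that was not in the average.
Xnew = [t, 2*t - sin(2*t), 0.5*t + cos(2*t)];
Xc   = Xnew - repmat(mean(Xnew, 1), size(Xnew, 1), 1);
PCs  = Xc * Eg;                          % one principal component per column
```

Because Eg is orthonormal, the projection preserves the total variance of the new scene; only its distribution among the components differs from what the scene specific eigenvectors would give.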

For comparison of the original PCs and PCs obtained with the ‘global’ statistics,
four sets of scene images are presented. The first is the grayscale image of the first few
principal components (Figures 4.16a/b and 4.17). The second is the direct RGB
representation of the principal components, including the variations of setting saturation to
one, value to one, and both saturation and value to one (Figures 4.18a/b and 4.19a). The
third image set is the HSV transformation of the principal components, including setting
saturation to one, value to one, and both saturation and value to one (Figures 4.20a/b
and 4.21a/b). The fourth set is the RGB transformation of the HSV image (Figures
4.22a/b and 4.23a/b).

Upon inspection, the grayscale principal component images appear to be very
similar in content, with only minor variations in scale. The direct RGB
comparisons clearly show differences, but the differences are similar in their pixel
locations, indicating that the images contain the same scene information.

Figure 4.16a: Davis-Monthan PC and Test Set PC Comparison. Panel A is the Original
(Scene Specific) PC1. Panel B is the Test (Average) PC1. Panel C is the Original (Scene
Specific) PC2. Panel D is the Test (Average) PC2.

Figure 4.16b: Davis-Monthan and Test Transform PC Comparison. Panel A is the
Original PC3. Panel B is the Test PC3. Panel C is the Original PC4. Panel D is the Test
PC4. (Note that the test PC4 contains more variance than the original PC4.)

[Figure panels: Original PCs; Test Set PCs]

Figure 4.17: Lake Tahoe and Test Transform PC Comparison. (Note that the higher
numbered test PCs contain more variance than their counterparts do.)

[Figure panel: RGB Representation With S=1]

Figure 4.18a: Davis-Monthan Direct RGB Representation of Original PCs and Test PCs.
Panels A and C are from the Original Data. Panels B and D are from the Test Set.
(Note that with no a priori knowledge of the scene, panels A and B may be construed to
contain small lakes rather than fairways.)

[Figure panel: RGB Representation With V=1]

Figure 4.18b: Davis-Monthan Direct RGB Representation of Original PCs and Test PCs.
Panels A and C are from the Original Data. Panels B and D are from the Test Set.
(Note the similarity between panels A and B, and C and D.)

[Figure panels: RGB Representation With S=1 (original and test set); RGB Representation With V=S=1 (original and test set)]

Figure 4.19: Lake Tahoe Direct RGB Representation of Original PCs and Test PCs.
(Note the similarity of scene content.)

[Figure panels: HSV FROM COVARIANCE, COLOR RESHAPED; HSV FROM COVARIANCE, COLOR RESHAPED (TEST SET); HSV FROM COVARIANCE, COLOR RESHAPED AND S=1; HSV FROM COVARIANCE, COLOR RESHAPED AND S=1 (TEST SET)]

Figure 4.20a: Davis-Monthan HSV and Test HSV Comparison. Panels A and C are
from the Original Data. Panels B and D are from the Test Set.

[Figure panels: HSV FROM COVARIANCE, COLOR RESHAPED AND V=1; HSV FROM COVARIANCE, COLOR RESHAPED AND V=S=1; HSV FROM COVARIANCE, COLOR RESHAPED AND V=S=1 (TEST SET)]

Figure 4.20b: Davis-Monthan HSV and Test HSV Comparison. Panels A and C are
from the Original Data. Panels B and D are from the Test Set.

[Figure panels: HSV FROM COVARIANCE, COLOR RESHAPED; HSV FROM COVARIANCE, COLOR RESHAPED (TEST SET); HSV FROM COVARIANCE, COLOR RESHAPED AND S=1; HSV FROM COVARIANCE, COLOR RESHAPED AND S=1 (TEST SET)]

Figure 4.21a: Lake Tahoe HSV and Test HSV Comparison.


The images of Figure 4.20a panel A and Figure 4.21a panel A are easier to view than
images like those in Figure 4.18a panel A and Figure 4.19 panel A.
Pixels that do not have a significant projection onto either PC2 or PC3 appear largely
desaturated. This makes the image easier to view because naturally occurring scenes tend
to be largely desaturated with low dimensionalities in the visible portion of the spectrum
(Buchsbaum and Gottschalk, 1983). Human observers are not used to examining images
with large regions of highly saturated hues that vary rapidly.
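
The desaturation effect can be illustrated with a small MATLAB sketch. The mapping below is an assumption made for illustration: hue is taken from the angle in the PC2-PC3 plane, saturation from the radius in that plane, and value from PC1. The actual transformation is defined in chapter three and may differ in scaling and orientation. The PC values are hypothetical.

```matlab
% Sketch only: four synthetic pixels, columns are PC1, PC2, PC3.
PC = [5  0  0;      % no projection onto PC2 or PC3
      5  2  0;
      8  0  2;
      2 -1 -1];

H = mod(atan2(PC(:, 3), PC(:, 2)), 2*pi) / (2*pi);   % hue in [0, 1)
r = hypot(PC(:, 2), PC(:, 3));                       % radius in the PC2-PC3 plane
S = r / max(r);                                      % saturation in [0, 1]
V = (PC(:, 1) - min(PC(:, 1))) / (max(PC(:, 1)) - min(PC(:, 1)));  % value in [0, 1]

rgb   = hsv2rgb([H S V]);                  % displayable colors, one row per pixel
rgbS1 = hsv2rgb([H ones(size(S)) V]);      % variation with saturation set to one
```

The first pixel has no projection onto PC2 or PC3, so its saturation is zero and it is rendered as neutral gray, while the second pixel has the same value but a fully saturated hue.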

[Figure panels: HSV FROM COVARIANCE, COLOR RESHAPED AND V=1; HSV FROM COVARIANCE, COLOR RESHAPED AND V=1 (TEST SET); HSV FROM COVARIANCE, COLOR RESHAPED AND V=S=1; HSV FROM COVARIANCE, COLOR RESHAPED AND V=S=1 (TEST SET)]

Figure 4.21b: Lake Tahoe HSV and Test HSV Comparison.

The image sets in which the saturation and value are set to one provide a way to
go back and forth between highly saturated and less saturated images to make material
classification more obvious. From these figures it is clear that while the colors of
the original PC and test PC sets may be different, the scene content remains the same and
visible.

[Figure panels: HSV TRANSFORMED; HSV TRANSFORMED (TEST SET); HSV TRANSFORMED, COLOR RESHAPED AND S=1; HSV FROM COVARIANCE, COLOR RESHAPED AND S=1 (TEST SET)]

Figure 4.22a: Davis-Monthan RGB Transformed HSV and Test RGB Transformed HSV
Comparison. Panels A and C are from the Original Data. Panels B and D are from the
Test Set.

Figure 4.22b: Davis-Monthan RGB Transformed HSV and Test RGB Transformed HSV
Comparison. Panels A and C are from the Original Data. Panels B and D are from the
Test Set.

[Figure panels: RGB Transformed; RGB Transformed Test Set; HSV TRANSFORMED; TRANSFORMED HSV, COLOR ROTATED (TEST SET); HSV TRANSFORMED, COLOR RESHAPED AND S=1; HSV FROM COVARIANCE, COLOR ROTATED AND S=1 (TEST SET)]

Figure 4.23a: Lake Tahoe RGB Transformed HSV and Test RGB Transformed HSV
Comparison.

Figure 4.23 makes an interesting comparison between the RGB transformed
images. The images on the left, which are directly transformed from their own scene
information, clearly show the dominance of the ‘blue’ band corresponding to the high
amount of water represented in the scene, while the images on the right, which were
transformed with the ‘global’ RGB eigenvectors, maintain the appearance of the HSV
image.

[Figure panels: RGB Transformed; RGB Transformed Test Set; HSV TRANSFORMED, COLOR RESHAPED AND V=1; HSV FROM COVARIANCE, COLOR ROTATED AND V=1 (TEST SET); HSV TRANSFORMED, COLOR RESHAPED AND V=S=1; HSV FROM COVARIANCE, COLOR ROTATED AND V=S=1 (TEST SET)]

Figure 4.23b: Lake Tahoe RGB Transformed HSV and Test RGB Transformed HSV
Comparison.

V. SUMMARY AND CONCLUSIONS


From analysis and comparison of the Davis-Monthan data to dissimilar scenes, it
is clear that for best analysis, it would be appropriate to maintain scenes such as
Davis-Monthan within one group and scenes such as Jasper Ridge and Lake Tahoe within
another group. But for first order unsupervised classification, the first few eigenvalues
and associated eigenvectors, which contain the largest amount of scene variance, can
appropriately represent the scene. Figure 5.1 and Table 5.1 reinforce this by showing that
over 98 percent of scene variance is contained within the first three eigenvectors. In fact,
over 95 percent is located within the first two. Extending this concept further and
drawing upon the results in the previous sections, it is clear that a generalized ‘global’ set
of eigenvectors can appropriately depict any scene content. The average eigenvectors
investigated in this study provide such a basis and can be further improved upon with an
increase in the number of data sets utilized.


[Figure axis label: Band Number]

Figure 5.1: Jasper Ridge and Lake Tahoe Eigenvectors Compared to Davis-Monthan Average.

Eigenvalue   Davis-Monthan   Jasper Ridge   Lake Tahoe
1            0.8417          0.6164         0.8093
2            0.1159          0.3479         0.1629
3            0.0259          0.0271         0.0147
4            0.0056          0.0026         0.0092
5            0.0028          0.0019         0.0022

Table 5.1: Eigenvalues for Davis-Monthan Average, Jasper Ridge, and Lake Tahoe.


Table  5.1  along  with  the  figures  in  chapter  four  graphically  depict  the  fact  that  the 
first  three  eigenvectors  are  the  most  important  in  describing  any  scene.  Referring  back  to 
section  3C  on  color  mapping  strategies,  it  is  clear  that  a  combination  of  the  first  three 
principal  component  transforms  will  appropriately  depict  any  scene. 
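
The cumulative variance figures quoted above can be reproduced directly from Table 5.1. The short MATLAB sketch below assumes that the tabulated eigenvalues are already normalized to fractions of the total scene variance, which is how they appear to be reported; the variable names are illustrative.

```matlab
% Rows: eigenvalue number 1-5. Columns: Davis-Monthan average,
% Jasper Ridge, Lake Tahoe (values copied from Table 5.1).
ev = [0.8417 0.6164 0.8093;
      0.1159 0.3479 0.1629;
      0.0259 0.0271 0.0147;
      0.0056 0.0026 0.0092;
      0.0028 0.0019 0.0022];

cumfrac = cumsum(ev, 1);   % running total of variance down each column
first2  = cumfrac(2, :);   % fraction captured by the first two eigenvectors
first3  = cumfrac(3, :);   % fraction captured by the first three eigenvectors
```

For the three data sets, first2 evaluates to 0.9576, 0.9643 and 0.9722, and first3 to 0.9835, 0.9914 and 0.9869, consistent with the statements that more than 95 percent of the variance lies in the first two eigenvectors and more than 98 percent in the first three.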

The  principal  component-based  mapping  strategy  discussed  previously  provides 
an  easy  way  to  perform  first  order  unsupervised  classification.  The  inclusion  and 
utilization  of  ‘global’  or  generalized  eigenvectors  decreases  the  overhead  required  to 
perform  the  first  order  classification  and  allows  for  ‘real  time’  classification  of 
hyperspectral  imagery.  The  resultant  image  is  segmented  spatially  based  on  a 
generalized  projection  of  the  radiance  distribution  in  the  PC2-PC3  plane.  By  visually 
inspecting  the  resulting  image,  an  analyst  can  then  direct  attention  to  appropriate  areas  of 
the  scene  for  further  processing  without  the  time  consuming  requirement  of  calculating 
the  scene  specific  statistics. 

The PC and RGB transformation eigenvectors utilized in this study were derived
by averaging the scene statistics from 13 similar scenes. These ‘global’ or generalized
eigenvectors were then applied to a similar scene and a dissimilar scene to observe the
effect. It is clear that these eigenvectors appropriately allowed for first order
classification and can be applied to a broad range of spectral imagery classes. These
eigenvectors can become even more robust as the number of ‘averaged’ scenes is increased.

It was shown that the 1st PC will always be related to the mean solar radiance, but
the 2nd, 3rd and subsequent PCs depend on the specific contents of the image. However, it
was also shown that only the first three PCs are required for a color mapping
corresponding to human color vision. The remapping of the MATLAB colorspace in
chapter three provides an orthogonal mapping of the colors, but requires further
refinement of the colorspace. It remains to be investigated whether or not the RGB
transformation of the HSV image presented here can be arranged so that materials are
presented in a straightforward manner, i.e., water always mapped to blue, vegetation to
green, etc., rather than having the dominant scene constituent set the base hue of the
image. The author believes that this mapping can be accomplished.

The presentation strategy discussed here is best suited to broad scale geographical
classification, not for identifying small, isolated targets. However, objects and variances
within the scene that occur at only a few pixels in an image, and thus have little effect
on the overall covariance matrix and do not contribute significantly to the 2nd and 3rd PCs,
appear to stand out and be discernible in this mapping strategy. For this reason, this
aspect of the mapping strategy merits further investigation.

The invariant display strategy and generalized eigenvectors presented here are
offered as a way to have a first look at a wide variety of spectral scenes. By performing a
PC transformation with these eigenvectors and analyzing the three most significant PCs,
an initial classification decision can be made in ‘real time’. Detailed investigation of the
relationship between the PC eigenvectors and dissimilar image content shows that this
strategy is robust enough to provide an accurate initial scene classification.




MATLAB  CODE 


Two  MATLAB  code  files  are  scripted  below.  The  first  was  utilized  for  conversion  to 
Band  Interleave  format  and  the  second  was  utilized  to  generate  the  statistics  and  principal 
components  for  analysis. 


%***********************************************************************
%
%This program will read in a 3-D Hyperspectral Data set from ENVI format
%
%
% by
% David I. Diersen
% October 2000
%
%
clear
%***********************************************************************
% HYPERSPECTRAL DATA FILE READER
%
%
% WRITTEN PRIMARILY FOR 'BSQ' BYTE ORDER 0-1 AND 'BIL' BYTE ORDER 0-1
%
%
% THIS FIRST SECTION GETS THE NECESSARY INFO ON THE FILE TO BE PROCESSED
%
%

filein=input(' What file do you want to process? ','s');
type=str2num(input(' What is the data type? (1,2,3,4,5,or6) ','s'));
if type==1
    type=char('int8');
elseif type==2
    type=char('int16');
elseif type==3
    type=char('int32');
elseif type==4   %Not dealing with 4, 5 and 6 right now
    type=char('float32');
elseif type==5
    type=char('float64');
elseif type==6
    error(' Complex data (type 6) is not handled. ');   %fread has no complex precision
else
    error(' You made a mistake, BOOM! ');
end
byte=str2num(input(' What is the byte order? (0=MS-DOS, 1=IEEE) ','s'));
inter=str2num(input(' What is the interleave type? (1=BSQ,2=BIP,3=BIL) ','s'));
l=str2num(input(' What is the number of samples? ','s'));
%Obtains the size of original data
w=str2num(input(' What is the number of rows? ','s'));
dim=str2num(input(' How many bands do you have? ','s'));
hdrsz=str2num(input(' How long is the header? ','s'));

%
% READ IN THE FILE
%
%
fid1=fopen(filein);   %Open the binary file
if inter==1   %Works on BSQ format files
    if byte==1   %Byte order is 1
        hdr=fseek(fid1,hdrsz,'bof');   %Sets the pointer to right after the header
        tmp=fread(fid1,[1,(l*dim*w*2)],'int8');
        %Reads the data in 1 byte increments (Splits data in half)
        tmp2=reshape(tmp,2,(length(tmp)/2));
        %These next three lines swap bytes
        tmp3=flipud(tmp2);
        tmp4=reshape(tmp3,1,length(tmp));   %Data is now in LSB-MSB order
        fid2=fopen('tempdata','w');   %Creates a temporary file
        count=fwrite(fid2,tmp4,'int8');
        %Writes the byte swapped data to binary file to reread as LSB-MSB
        fclose(fid2);   %Closes the write handle so the data is flushed to disk
        fid2=fopen('tempdata');   %Open the temp file
        data=fread(fid2,[1,(l*w*dim)],'int16');   %Reads the data as an integer
        dat=reshape(data,l,w,dim);
        [l c d]=size(dat)
        for i=1:d
            %Reorders and transposes data because of read in issues
            AA(:,:,i)=dat(:,:,i)';
        end
        fclose(fid2);
        delete tempdata   %Deletes the temporary read file
    else   %Byte order is 0
        two_D_file=fread(fid1,[1,(l*dim*w)],type);
        dat=reshape(two_D_file,l,w,dim);
        [l c d]=size(dat)
        for i=1:d
            %Reorders and transposes data because of read in issues, memory size
            AA(:,:,i)=dat(:,:,i)';
        end
    end

elseif inter==2
    two_D_file=fread(fid1,[1,(l*w*dim)],type);
elseif inter==3   %Works on BIL format files
    if byte==1
        fseek(fid1,hdrsz,'bof');   %Sets the pointer to right after the header
        fid2=fopen('tempdata','w');   %Creates a temporary file
        tmp=(flipud(fread(fid1,[2,(l*w*dim)/4],'int8')));
        %Reads the data in 1 byte increments (Splits data up because of size)
        posit=ftell(fid1);
        fwrite(fid2,tmp,'int8');
        clear tmp
        fseek(fid1,posit,'bof');
        %This could/should be shortened with a "for" loop
        tmp=(flipud(fread(fid1,[2,(l*w*dim)/4],'int8')));
        clear posit
        posit=ftell(fid2);
        fseek(fid2,posit,'bof');
        fwrite(fid2,tmp,'int8');
        clear tmp
        tmp=(flipud(fread(fid1,[2,(l*w*dim)/4],'int8')));
        fwrite(fid2,tmp,'int8');
        clear tmp
        tmp=(flipud(fread(fid1,[2,(l*w*dim)/4],'int8')));
        fwrite(fid2,tmp,'int8');
        clear tmp

        fclose(fid2);   %Closes the write handle so the data is flushed to disk
        fid2=fopen('tempdata');   %Open the temp file
        for i=1:w/4   %Reads the file into a concatenated 2-D matrix
            x1(i,:)=fread(fid2,[1,(l*dim)],'int16');
        end
        save X x1
        clear i x1
        for i=1:w/4   %Reads the file into a concatenated 2-D matrix
            x2(i,:)=fread(fid2,[1,(l*dim)],'int16');
        end
        save Y x2
        clear i x2
        for i=1:w/4   %Reads the file into a concatenated 2-D matrix
            x3(i,:)=fread(fid2,[1,(l*dim)],'int16');
        end
        save XX x3
        clear i x3
        for i=1:w/4   %Reads the file into a concatenated 2-D matrix
            x4(i,:)=fread(fid2,[1,(l*dim)],'int16');
        end
        save YY x4
        clear i x4
        fclose(fid2);
        delete tempdata
        load X
        load Y   %Correctly saved
        load XX
        load YY

fid3=fopen('tempdata', 'w'); %Creates a temporary file
for i=1:(l*dim)

fwrite(fid3, x1(:,i), 'int16');
fwrite(fid3, x2(:,i), 'int16');
fwrite(fid3, x3(:,i), 'int16');
fwrite(fid3, x4(:,i), 'int16');

end

clear x1 x2 x3 x4
fclose(fid3);
fid4=fopen('tempdata');

AA=fread(fid4, [w, (l*dim)], 'int16');

AA=reshape(AA, w, l, dim);
fclose(fid4);
delete tempdata

clear  fid2  fid3  fid4  hdr  tmp  tmp2  tmp3  tmp4  count  X  Y 
else 

two_D_file=fread(fid1, [1, (l*w*dim)], type);
dat=reshape(two_D_file, l*dim, w);
tmp=dat';

AA=reshape(tmp, l, w, dim); %Resizes the file into correct bands and size

AA=reshape(AA, w, l, dim);

end 

end 

fclose(fid1);
clear type l w dim fid1 inter i c d byte count dat data hdrsz posit
%***********************************************************************

% THE OUTPUT OF THE ABOVE BLOCK IS THE HYPERSPECTRAL DATA CUBE LABELED AA

%This section takes in the 3-D data set (AA) and processes the stats
%(Mean, Covariance, Correlation, Eigenvalues, Eigenvectors, and
%Principal Components)

% 

clear 

load erf10m012origdata %DATA IN MATLAB FORMAT

save  tmp2  AA 

%

%***********************************************************************
[l c d]=size(AA); %Obtains the # rows, columns, and bands

N=l*c; %Number of pixels in the scene

%  FIND  THE  MEAN 

for i=1:d %Applied for each band

mean_(i)=(1/N)*(sum(sum(AA(:,:,i))));

%Finds the mean for each band, vector form
end

mean_=mean_'; %Mean into column form

save  stats2  mean_ 

%  SUBTRACT  THE  MEAN  FROM  EACH  BAND 
for i=1:d

AA(:,:,i)=(AA(:,:,i)-mean_(i));

%Subtracts the mean from each band and replaces AA
end

clear  i  x 

%The  above  works  well  and  fast.  Now  working  by  pixels,  slows  down 
%  FIND  THE  CORRELATION  AND  COVARIANCE  MATRIX 
k=1; %Initialize the pixel count

Cov=zeros(d, d); %Initialize covariance matrix

Auto_Corr=zeros(d, d); %Initialize Auto Correlation matrix

for i=1:N

%Run for each pixel THIS TAKES A LONG, LONG, LONG TIME.

z=k:(l*c):(l*c*d); %Linear indices of pixel k in every band

cov=((1/(N-1)).*(AA(z)'*AA(z))); %Normalized pixel covariance
corr=((AA(z)')+mean_)*(AA(z)+(mean_')); %Adds mean back in
%Get the Auto correlation at the same time E[(x)(x)']

Cov=Cov+cov; %Summing over entirety

Auto_Corr=Auto_Corr+corr; %Summing over entirety
clear cov corr

k=k+1; %Advances pixel

end
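
%The per-pixel loop above can also be written as two matrix products.
%What follows is a sketch only and is not part of the original listing.
%X, Xm, Cov_vec and Auto_Corr_vec are hypothetical names, and the sketch
%assumes AA is of class double and that a second copy of the cube fits
%in memory.
X=reshape(AA, N, d); %One row per pixel, one column per band (mean removed)
Cov_vec=(X'*X)/(N-1); %Should agree with Cov up to round-off
Xm=X+repmat(mean_', N, 1); %Adds the band means back in
Auto_Corr_vec=Xm'*Xm; %Should agree with Auto_Corr up to round-off
clear X Xm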

save  stats2  Cov  Auto_Corr  -append 
clear  AA  i  j  k  tmp  z  mean_ 

[S_C Eval Evect]=svd(Cov); %SVD decomp to get evals and evectors

save stats2 S_C Eval Evect -append %Save stats to save memory
clear S_C Eval Evect

% 

%  NOW  GET  THE  CORRELATION  MATRIX  WITH  COEFFICIENTS 

% 

D=(diag(Cov))*(diag(Cov))';

%Use the diagonal of the cov matrix to get the denominator of corr matrix
for i=1:length(Cov)^2

Corr(i)=(Cov(i))/sqrt(D(i)); %Builds Correlation vector

end %Next reshapes correlation matrix

Corr=reshape(Corr, sqrt(length(Corr)), sqrt(length(Corr)));
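
%Sketch only, not part of the original listing: the same correlation
%coefficients from one element-wise division. Corr_vec is a hypothetical
%name; D is the outer product of the band variances computed above.
Corr_vec=Cov./sqrt(D); %Should agree with Corr up to round-off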

[S_Corr Eval_Corr Evect_Corr]=svd(Corr);

%SVD decomp to get evals and evectors of the Correlation matrix
clear x i D l c d

save stats2 Corr S_Corr Eval_Corr Evect_Corr -append
clear Corr S_Corr Eval_Corr Evect_Corr Cov

%

% NOW THE DECOMPOSITION OF E[x*x']

%

[S_AC Eval_AC Evect_AC]=svd(Auto_Corr);

%SVD decomp to get evals and evectors
save stats2 S_AC Eval_AC Evect_AC -append
clear Auto_Corr S_AC Eval_AC Evect_AC

%

%  This  section  will  find  the  principal  components,  pixel  by  pixel 
% 

clear 

load  stats2  %Loads  the  statistics  file 

load  tmp2  %Loads  the  MATLAB  format  original  data 

[l, c, d]=size(AA); %Rows, columns and dim of orig data

k=1;

N=(l*c);

for i=1:N %Run for each pixel, THIS TAKES A LONG, LONG TIME.

z=k:(l*c):(l*c*d);

AA(z)=(Evect')*(AA(z)'); %Transform from covariance
k=k+1;

end

PC=AA; %Creates principal component transform variable

clear AA
i=1:25;

PCs(:,:,i)=PC(:,:,i); %Only want the first 25 Principal components

clear PC

PC=PCs;

save xforms2 PC %Saves to a MATLAB file

clear PC i k z AA

clear  %Clears  to  keep  memory  from  becoming  full 

load  stats2  %As  above 

load  tmp2 

[l, c, d]=size(AA); %Rows, columns and dim of orig data
k=1;

N=(l*c);

for i=1:N %Run for each pixel, THIS TAKES A LONG TIME.

z=k:(l*c):(l*c*d);

AA(z)=(Evect_Corr')*(AA(z)'); %Transform from coefficients
k=k+1;

end

PC_R=AA;
clear AA
i=1:25;

PCs(:,:,i)=PC_R(:,:,i); %Only want the first 25 transforms
clear PC_R
PC_R=PCs;
save xforms2 PC_R -append %Saves the transform from coefficients
clear PC_R i k z AA

clear  %Make  more  memory  room 

load  stats2 
load  tmp2 

[l, c, d]=size(AA); %Rows, columns and dim of orig data

k=1;

N=(l*c);

for i=1:N %Run for each pixel, THIS TAKES A LONG TIME.

z=k:(l*c):(l*c*d);

AA(z)=(Evect_AC')*(AA(z)'); %Transform from E[xx']
k=k+1;

end

PC_AC=AA;
clear AA
i=1:25;

PCs(:,:,i)=PC_AC(:,:,i); %Only want the first 25 transforms
clear PC_AC
PC_AC=PCs;

save xforms2 PC_AC -append
clear PC_AC i k z AA c d N

%*****************************************************************
%
% Outputs three files, tmp2 which is the original data file, stats2
% which is the statistics from the three methods, and xforms2 which is
% the first 25 bands of the principal component transforms.
%*****************************************************************